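As usual, this post is only for readers of Koki Saitoh's 『ゼロから作るDeep Learning』 (O'REILLY). On top of that, it assumes some knowledge of and interest in the butterfly effect and chaotic phenomena, which should narrow the readership even further. Someone will probably tell me to write this up as a paper or a bulletin article rather than a blog post, but I don't know whether I can dig that deep, so for now it goes on the blog. As before, I have backdated the post so that it does not clutter the list of new entries.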
ゼロから作るDeep Learning: Pythonで学ぶディープラーニングの理論と実装
- Author: Koki Saitoh (斎藤 康毅)
- Release date: 2016/09/24
- Format: Book (softcover)
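The previous post, "Part 2", dealt with a two-layer neural network meant to realize exclusive OR (EOR), with 2×4 + 4×2 weight elements and 4 + 2 bias elements around the hidden layer. When the initial weights, drawn from Gaussian random numbers, were rounded one decimal place at a time, the curves of accuracy and loss over the training iterations changed drastically.
The same phenomenon is observed in a network whose hidden layer gives 2×3 + 3×2 weight elements and 3 + 2 bias elements.
That is, instead of calling np.random.randn(), I wrote the initial values of W1 and W2 directly as numbers, using values I happened to have dumped…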
>>> W1
array([[ 0.00739552, -0.01348939, -0.01178099],
[ 0.00189079, -0.00239779, 0.01830071]])
>>> W2
array([[-0.01346973, 0.01634472],
[ 0.01377876, -0.00612065],
[ 0.00380564, 0.02487122]])
…then rounded them one digit at a time with np.round() and plotted the accuracy acc and the loss value loss.
Left: decimals=7, right: decimals=6.
Left: decimals=5, right: decimals=4.
Left: decimals=3, right: decimals=2.
Don't the curves change even more violently than with 2×4 + 4×2 weights and 4 + 2 biases?
Look at decimals=4 → 5 → 6 in particular: the initial values should differ by less than 1/10,000 and 1/100,000 respectively.
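As a quick check on how small those differences are, the sketch below rounds the W1 values dumped above with np.round() and measures the largest gap between neighbouring decimals settings. The helper name max_gap is made up for this illustration and is not part of the original code.

```python
import numpy as np

# Initial W1 as dumped earlier in this post
W1 = np.array([[0.00739552, -0.01348939, -0.01178099],
               [0.00189079, -0.00239779, 0.01830071]])

def max_gap(w, d_coarse, d_fine):
    """Largest absolute difference between w rounded to d_coarse and to d_fine decimals."""
    return np.abs(np.round(w, decimals=d_coarse) - np.round(w, decimals=d_fine)).max()

for d in (4, 5):
    print("decimals", d, "vs", d + 1, "-> max gap", max_gap(W1, d, d + 1))
```

Both gaps come out below the bounds quoted above, which is what makes the visible change in the curves so striking.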
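Normally, rather than changing the number of decimal places used for rounding, one would try to speed up convergence by tuning other hyperparameters. As in the previous post, I imagined as much and tried that too.
Namely, I kept increasing weight_init_std, the multiplier applied to the initial values of the weights W1 and W2.
Left: weight_init_std=1., right: weight_init_std=2.
Left: weight_init_std=5., right: weight_init_std=10.
Left: weight_init_std=20., right: weight_init_std=50.
This is interesting in its own right: what is going on here? But there are plenty of people optimizing all sorts of hyperparameters to speed up convergence, whereas what I am working on is a mere exclusive OR... though I suppose saying so gives the game away?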
カオス: 新しい科学をつくる (新潮文庫)
- Author: James Gleick (ジェイムズ・グリック)
- Format: Paperback (bunko)
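Anyway, to find out what is happening inside, I hit on the idea of plotting the weights W1, W2 and the biases b1, b2.
Layer 1 is 2×3 and layer 2 is 3×2, but rows 1 and 2 of layer 1 and columns 1 and 2 of layer 2 correspond to columns 1 and 2 of the teacher data, so it should be possible to treat them independently (to be verified later).
In other words, if I look only at row 1 of layer 1, or only at column 1 of layer 2, the data is three-dimensional and can just about be plotted.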
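To make concrete which slice of W1 is meant, here is a small sketch using the XOR inputs and the W1 dump from this post. It only shows how the rows of W1 enter the product x·W1; it does not settle the independence question that the text marks as still to be verified.

```python
import numpy as np

x = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)  # XOR inputs
W1 = np.array([[0.00739552, -0.01348939, -0.01178099],
               [0.00189079, -0.00239779, 0.01830071]])

full = x @ W1
# The same product written row by row: input column i scales row i of W1
by_rows = x[:, [0]] * W1[0] + x[:, [1]] * W1[1]

print(np.allclose(full, by_rows))  # True
print(W1[0].shape)                 # (3,): one row of W1 is a point in 3-D space
```

So following W1[0] over the training steps gives a three-element trajectory, which is what the 3D plots below draw.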
So I tried it.
First, I tried drawing row 1 of the first-layer weights, W1[0][0], W1[0][1], W1[0][2], as a 3D line graph and as a three-view projection.
Here is the code again. For no deep reason, I have shortened the variable names a little.
As before, to run it you need to move into the directory of the library downloaded from O'REILLY's GitHub repository (see the Preface, pp. vii to ix).
#Code 3-1
import sys, os
sys.path.append(os.pardir)  # make the repository's common/ package importable
import numpy as np
from common.functions import *  # sigmoid, softmax, cross_entropy_error
from common.gradient import numerical_gradient as n_g

# XOR inputs and one-hot teacher data
x_e = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
t_e = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])

# fixed initial weights (dumped random values) times the multiplier
weight_init_std = 0.1
W1 = weight_init_std * np.array([
    [ 0.07395519, -0.13489392, -0.1178099 ],
    [ 0.01890785, -0.02397794,  0.18300705]])
W2 = weight_init_std * np.array([
    [-0.13469725,  0.1634472 ],
    [ 0.13778756, -0.06120645],
    [ 0.03805643,  0.24871219]])
b1 = np.zeros(3)
b2 = np.zeros(2)

def predict(x):
    A1 = np.dot(x, W1) + b1
    Z1 = sigmoid(A1)
    A2 = np.dot(Z1, W2) + b2
    y = softmax(A2)
    return y

def loss(x, t):
    y = predict(x)
    return cross_entropy_error(y, t)

def acc(x, t):
    y = predict(x)
    y = np.argmax(y, axis=1)
    t = np.argmax(t, axis=1)
    accuracy = np.sum(y == t) / float(x.shape[0])
    return accuracy

# the argument W is ignored; loss() reads the global W1, b1, W2, b2
loss_W = lambda W: loss(x_e, t_e)
loss_list, acc_list = [], []
data_list = [[] for i in range(3)]
l_r, s_n = 5.0, 70  # learning rate, number of steps
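One detail of "#Code 3-1" that is easy to trip over: loss_W ignores its argument, yet the gradient still comes out right. The sketch below shows why this can work, assuming the library's numerical_gradient perturbs the array it is given in place (that is my reading, not something stated in this post). The function numerical_gradient_inplace is a hypothetical stand-in written for this illustration, not the repository's code.

```python
import numpy as np

def numerical_gradient_inplace(f, x, h=1e-4):
    """Central-difference gradient that perturbs x in place (stand-in, see lead-in)."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        saved = x[idx]
        x[idx] = saved + h
        f_plus = f(x)
        x[idx] = saved - h
        f_minus = f(x)
        x[idx] = saved  # restore the original value
        grad[idx] = (f_plus - f_minus) / (2 * h)
    return grad

w = np.array([1.0, 2.0, 3.0])
# Like loss_W, this function ignores its argument and reads the global array
f_ignoring_arg = lambda _unused: np.sum(w ** 2)

grad = numerical_gradient_inplace(f_ignoring_arg, w)
print(grad)  # close to 2 * w
```

Because the perturbation is applied to the very array the function reads, the ignored argument does no harm. It would break if the gradient routine worked on a copy.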
For ease of reference later, I split off the following part and call it "#Code 3-2". In the interactive mode of the Anaconda Prompt, simply paste it right after the previous block.
import matplotlib.pyplot as plt #Code 3-2
for i in range(s_n):
    W1 -= l_r * n_g(loss_W, W1)
    b1 -= l_r * n_g(loss_W, b1)
    W2 -= l_r * n_g(loss_W, W2)
    b2 -= l_r * n_g(loss_W, b2)
    loss_list.append(loss(x_e, t_e))
    acc_list.append(acc(x_e, t_e))
    for k in range(3):
        data_list[k].append(W1[0, k])  # record row 1 of W1
For how to draw a 3D line graph, I referred to the 西住工房 site. Thank you very much.
If you paste "#Code 3-3" below into the interactive prompt after "#Code 3-1" and "#Code 3-2"…
#Code 3-3
from mpl_toolkits.mplot3d import Axes3D  # 3D plotting
fig = plt.figure()
# Axes3D(fig) no longer attaches itself to the figure on recent matplotlib
ax = fig.add_subplot(111, projection='3d')
ax.plot(data_list[0], data_list[1], data_list[2], "o-")
ax.set_xlabel('W100')  # axis labels
ax.set_ylabel('W101')
ax.set_zlabel('W102')
plt.show()
…a graph like this should appear.
The start and end points of the graph are hard to tell apart: the start is near the origin, and the end can be found by dumping W1 after the graph has been drawn.
>>> W1
array([[ 3.67330855, 0.74832202, -4.15234076],
[-4.11706592, 0.34038046, 3.74518328]])
That is, row 1 is trying to converge toward somewhere around (3.7, 0.7, -4.2) (or so it appears).
I adapted "optimizer_compare_naive.py", the script that draws Fig. 6-8 on p. 177 of 『ゼロから作るDeep Learning』, to draw a graph in the style of a 2D three-view projection. The script can be downloaded from O'REILLY's GitHub repository.
The plotted data is still held in memory, so right after closing the graph drawn by "#Code 3-3" above, paste "#Code 3-4" below and…
#Code 3-4
plt.subplot(2, 2, 2)  # upper right: W1[0][0] vs W1[0][2]
plt.plot(data_list[0], data_list[2], 'o-')
plt.xlabel("W100")
plt.ylabel("W102")
plt.subplot(2, 2, 3)  # lower left: W1[0][1] vs W1[0][2]
plt.plot(data_list[1], data_list[2], 'o-')
plt.xlabel("W101")
plt.ylabel("W102")
plt.subplot(2, 2, 4)  # lower right: W1[0][0] vs W1[0][1]
plt.plot(data_list[0], data_list[1], 'o-')
plt.xlabel("W100")
plt.ylabel("W101")
plt.show()
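…the lower left panel is the projection of the "#Code 3-3" graph onto its left side face, the upper right panel the projection onto its right side face, and the lower right panel the projection onto its bottom face, each drawn as a 2D graph.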
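Since the start and end of the trajectory are hard to tell apart, one option is to mark them in each projection. The following is a minimal sketch of that idea, not the code used for the figures here: plot_projections is a hypothetical helper, and the trajectory is made-up data standing in for data_list. The Agg backend is selected only so the sketch runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # off-screen backend; drop this line in an interactive session
import matplotlib.pyplot as plt

# Made-up 3-D curve standing in for data_list (not real training data)
s = np.linspace(0.0, 1.0, 30)
traj = np.stack([s, s ** 2, -s], axis=1)  # shape (30, 3)

def plot_projections(traj, names):
    """Draw the three axis-aligned projections and mark the first and last points."""
    fig = plt.figure()
    # (x index, y index, subplot position), same layout as the 2x2 grid above
    for xi, yi, pos in [(0, 2, 2), (1, 2, 3), (0, 1, 4)]:
        ax = fig.add_subplot(2, 2, pos)
        ax.plot(traj[:, xi], traj[:, yi], "o-")
        ax.plot(traj[0, xi], traj[0, yi], "gs", label="start")
        ax.plot(traj[-1, xi], traj[-1, yi], "r^", label="end")
        ax.set_xlabel(names[xi])
        ax.set_ylabel(names[yi])
    fig.axes[0].legend()
    return fig

fig = plot_projections(traj, ["W100", "W101", "W102"])
```

With the real data, np.array(data_list).T would give an array of the same (steps, 3) shape as traj.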
Incidentally, if you paste "#Code 3-5" below after closing the graph…
#Code 3-5
x = np.arange(len(loss_list))
plt.plot(x, loss_list, label='loss')
plt.plot(x, acc_list, label='acc', linestyle='--')
plt.xlabel("iteration")
plt.legend()
plt.show()
…the graph of the accuracy acc and the loss value loss is displayed. Its use is limited to checking which data is currently being handled.
Next comes the code that plots the biases b1[0], b1[1], b1[2]. It has to be pasted after "#Code 3-1". The only reason for pasting "#Code 3-1" again is to reset the data, so if you have already drawn at least one graph, you can paste the shortened version "#Code 3-1'" below instead.
#Code 3-1'
W1 = weight_init_std * np.array([
    [ 0.07395519, -0.13489392, -0.1178099 ],
    [ 0.01890785, -0.02397794,  0.18300705]])
W2 = weight_init_std * np.array([
    [-0.13469725,  0.1634472 ],
    [ 0.13778756, -0.06120645],
    [ 0.03805643,  0.24871219]])
b1 = np.zeros(3)
b2 = np.zeros(2)
loss_list, acc_list = [], []
data_list = [[] for i in range(3)]
"#Code 3-6" records the b1 values to be plotted.
#Code 3-6
for i in range(s_n):
    W1 -= l_r * n_g(loss_W, W1)
    b1 -= l_r * n_g(loss_W, b1)
    W2 -= l_r * n_g(loss_W, W2)
    b2 -= l_r * n_g(loss_W, b2)
    loss_list.append(loss(x_e, t_e))
    acc_list.append(acc(x_e, t_e))
    for k in range(3):
        data_list[k].append(b1[k])  # record b1
"#Code 3-7" draws the 3D line graph of b1.
The import commented out on line 2 has to be executed if this is the first time a 3D graph is drawn since entering the interactive mode of the Anaconda Prompt.
#Code 3-7
#from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
# Axes3D(fig) no longer attaches itself to the figure on recent matplotlib
ax = fig.add_subplot(111, projection='3d')
ax.plot(data_list[0], data_list[1], data_list[2], "o-")
ax.set_xlabel('b10')
ax.set_ylabel('b11')
ax.set_zlabel('b12')
plt.show()
Pasting "#Code 3-1" or "#Code 3-1'", then "#Code 3-6" and "#Code 3-7" in succession should display a 3D line graph of b1 like the following.
Pasting "#Code 3-8" below next should display the graph in the style of a 2D three-view projection.
#Code 3-8
plt.subplot(2, 2, 2)
plt.plot(data_list[0], data_list[2], 'o-')
plt.xlabel("b10")
plt.ylabel("b12")
plt.subplot(2, 2, 3)
plt.plot(data_list[1], data_list[2], 'o-')
plt.xlabel("b11")
plt.ylabel("b12")
plt.subplot(2, 2, 4)
plt.plot(data_list[0], data_list[1], 'o-')
plt.xlabel("b10")
plt.ylabel("b11")
plt.show()
In the 3D line graph the trajectory looked like a straight line, but in fact it turns back partway. I suspect this is very important information: it seems to hint at the existence of what chaos theory calls an "attractor". Or is it too early to say that?
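A turn-back of this kind can also be located numerically rather than by eye. The sketch below is one simple way to do it: it flags the steps at which a recorded component reverses direction. The function turning_points and the sample series are made up for illustration. With the real data, one would pass in one of the lists in data_list.

```python
import numpy as np

def turning_points(values):
    """Indices i at which the series reverses direction, i.e. values[i] is a local peak or trough."""
    signs = np.sign(np.diff(values))
    flips = signs[:-1] * signs[1:] < 0  # consecutive steps with opposite sign
    return np.nonzero(flips)[0] + 1

# Made-up series: rises to 3.0, falls to 2.0, then rises again
series = np.array([0.0, 1.0, 2.0, 3.0, 2.5, 2.0, 2.2])
print(turning_points(series))  # [3 5]
```

This says nothing about attractors by itself. It only pins down at which iteration the reversal happens.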
For reference, here is the dump of b1 after the graph was drawn.
>>> b1
array([3.02969873, 3.49216726, 3.06368501])
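"#Code 3-9" plots the second-layer weights W2[0][0], W2[1][0], W2[2][0]. That the set runs along a column here, unlike the row used for W1, may need an explanation with the matrix expressions or a diagram. I will skip that for now.
To initialize the plotting data, "#Code 3-9" has to be pasted right after "#Code 3-1" or "#Code 3-1'".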
#Code 3-9
for i in range(s_n):
    W1 -= l_r * n_g(loss_W, W1)
    b1 -= l_r * n_g(loss_W, b1)
    W2 -= l_r * n_g(loss_W, W2)
    b2 -= l_r * n_g(loss_W, b2)
    loss_list.append(loss(x_e, t_e))
    acc_list.append(acc(x_e, t_e))
    for k in range(3):
        data_list[k].append(W2[k, 0])  # record column 1 of W2
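On the remark that W2 is grouped by column: a short numerical check may stand in for the diagram for the time being. In A2 = Z1·W2, output unit 0 depends only on column 0 of W2. The hidden activations Z1 below are arbitrary made-up numbers, and W2 uses the values from "#Code 3-1" before scaling.

```python
import numpy as np

# Arbitrary hidden-layer activations for two samples (illustrative values only)
Z1 = np.array([[0.2, 0.5, 0.9],
               [0.7, 0.1, 0.4]])
W2 = np.array([[-0.13469725, 0.1634472],
               [0.13778756, -0.06120645],
               [0.03805643, 0.24871219]])

A2 = Z1 @ W2
# Output unit 0 only ever sees column 0 of W2
from_col0 = Z1 @ W2[:, 0]
print(np.allclose(A2[:, 0], from_col0))  # True
```

That is why W2[0][0], W2[1][0], W2[2][0] form the natural three-element set for the second layer.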
After "#Code 3-1" or "#Code 3-1'" and "#Code 3-9", paste "#Code 3-10" below and…
#Code 3-10
#from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
# Axes3D(fig) no longer attaches itself to the figure on recent matplotlib
ax = fig.add_subplot(111, projection='3d')
ax.plot(data_list[0], data_list[1], data_list[2], "o-")
ax.set_xlabel('W200')
ax.set_ylabel('W210')
ax.set_zlabel('W220')
plt.show()
…a 3D line graph of W2 like the following should appear.
Right after closing the graph, paste "#Code 3-11" below and…
#Code 3-11
plt.subplot(2, 2, 2)
plt.plot(data_list[0], data_list[2], 'o-')
plt.xlabel("W200")
plt.ylabel("W220")
plt.subplot(2, 2, 3)
plt.plot(data_list[1], data_list[2], 'o-')
plt.xlabel("W210")
plt.ylabel("W220")
plt.subplot(2, 2, 4)
plt.plot(data_list[0], data_list[1], 'o-')
plt.xlabel("W200")
plt.ylabel("W210")
plt.show()
…the graph of W2 in the three-view style should appear. Here too, a turn-back can be seen.
This is the dump of W2 after the graph was drawn.
>>> W2
array([[ 7.31684671, -7.31397172],
[ 5.81499619, -5.80733808],
[ 7.40499364, -7.37631678]])
The second-layer biases b2[0], b2[1] have only two elements, so plotting them is comparatively easy. After initializing with "#Code 3-1" or "#Code 3-1'", paste "#Code 3-12" below and…
#Code 3-12
for i in range(s_n):
    W1 -= l_r * n_g(loss_W, W1)
    b1 -= l_r * n_g(loss_W, b1)
    W2 -= l_r * n_g(loss_W, W2)
    b2 -= l_r * n_g(loss_W, b2)
    loss_list.append(loss(x_e, t_e))
    acc_list.append(acc(x_e, t_e))
    for k in range(2):
        data_list[k].append(b2[k])  # record b2

plt.plot(data_list[0], data_list[1], 'o-')
plt.xlabel("b20")
plt.ylabel("b21")
plt.show()
…a graph like the following should appear.
Here is the dump of b2. However many times you run it, you should get the same digits down to the last place.
>>> b2
array([-17.26072032, 17.26072032])
That said, re-initializing every time is a nuisance, so I am currently modifying the code so that the plotting data for W1, b1, W2 and b2 can all be recorded in a single run.
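One possible shape for such a single-run recorder is sketched below. It is self-contained: sigmoid, softmax, the cross-entropy and a central-difference gradient are re-implemented locally instead of imported from the book's common package, so its numbers are not guaranteed to match the dumps above digit for digit. The names train and history are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def softmax(a):
    a = a - a.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(a)
    return e / e.sum(axis=1, keepdims=True)

def numerical_gradient(f, x, h=1e-4):
    """Central-difference gradient of f() with respect to x, perturbing x in place."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        saved = x[idx]
        x[idx] = saved + h
        f_plus = f()
        x[idx] = saved - h
        f_minus = f()
        x[idx] = saved
        grad[idx] = (f_plus - f_minus) / (2 * h)
    return grad

def train(steps=70, lr=5.0, weight_init_std=0.1):
    """Train the 2-3-2 XOR net from fixed initial values, recording all four histories."""
    x = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
    t = np.array([[1, 0], [0, 1], [0, 1], [1, 0]], dtype=float)
    W1 = weight_init_std * np.array([[0.07395519, -0.13489392, -0.1178099],
                                     [0.01890785, -0.02397794, 0.18300705]])
    W2 = weight_init_std * np.array([[-0.13469725, 0.1634472],
                                     [0.13778756, -0.06120645],
                                     [0.03805643, 0.24871219]])
    b1, b2 = np.zeros(3), np.zeros(2)

    def loss():
        y = softmax(sigmoid(x @ W1 + b1) @ W2 + b2)
        return -np.sum(t * np.log(y + 1e-7)) / x.shape[0]

    history = {"W1_row0": [], "b1": [], "W2_col0": [], "b2": [], "loss": []}
    for _ in range(steps):
        # same sequential update order as in the code blocks above
        for p in (W1, b1, W2, b2):
            p -= lr * numerical_gradient(loss, p)
        history["W1_row0"].append(W1[0].copy())
        history["b1"].append(b1.copy())
        history["W2_col0"].append(W2[:, 0].copy())
        history["b2"].append(b2.copy())
        history["loss"].append(loss())
    return {name: np.array(values) for name, values in history.items()}

history = train()
print({name: values.shape for name, values in history.items()})
```

Because the initial values are fixed rather than random, calling train() twice returns identical arrays, which matches the remark above that the dumps repeat to the last digit. Each history can then be fed to the plotting snippets without re-initializing in between.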