From 464f107928902c76d449a034a7f6817fa9af1702 Mon Sep 17 00:00:00 2001
From: benjas <909336740@qq.com>
Date: Sun, 18 Apr 2021 20:58:48 +0800
Subject: [PATCH] Add. Build the decision tree

---
 .../5.决策树——每次选一边.md | 21 ++++++++++++++++++
 .../assets/1618749843112.png | Bin 0 -> 1310 bytes
 .../assets/1618749968444.png | Bin 0 -> 1605 bytes
 .../assets/1618750408613.png | Bin 0 -> 4516 bytes
 4 files changed, 21 insertions(+)
 create mode 100644 机器学习算法理论及应用/李航——统计学习方法/assets/1618749843112.png
 create mode 100644 机器学习算法理论及应用/李航——统计学习方法/assets/1618749968444.png
 create mode 100644 机器学习算法理论及应用/李航——统计学习方法/assets/1618750408613.png

diff --git a/机器学习算法理论及应用/李航——统计学习方法/5.决策树——每次选一边.md b/机器学习算法理论及应用/李航——统计学习方法/5.决策树——每次选一边.md
index b1d513d..06036f4 100644
--- a/机器学习算法理论及应用/李航——统计学习方法/5.决策树——每次选一边.md
+++ b/机器学习算法理论及应用/李航——统计学习方法/5.决策树——每次选一边.md
@@ -213,3 +213,24 @@ Information gain ratio
 For example, the age feature above has 3 categories (young, middle-aged, elderly), ![1618749717493](assets/1618749717493.png)
 
 
+
+
+The only difference between the information gain ratio and the information gain is the division by ![1618749843112](assets/1618749843112.png)
+
+
+
+### Constructing the decision tree
+
+Build the decision tree
+
+ID3 algorithm:
+
+- Input: training dataset D, feature set A, threshold ε;
+- Output: decision tree T
+  1. If all instances in D belong to the same class ![1618749968444](assets/1618749968444.png), then T is a single-node tree; use class ![1618749968444](assets/1618749968444.png) as the class label of that node and return T;
+  2. If A = Ø, then T is a single-node tree; use the class ![1618749968444](assets/1618749968444.png) with the most instances in D as the class label of that node and return T;
+  3. Otherwise, compute the information gain of each feature in A with respect to D and select the feature Ag with the largest information gain;
+  4. If the information gain of Ag is less than the threshold ε, then set T to a single-node tree; use the class ![1618749968444](assets/1618749968444.png) with the most instances in D as the class label of that node and return T;
+  5. Otherwise, for each possible value ai of Ag, split D according to ![1618750408613](assets/1618750408613.png) into non-empty subsets Di, label each child node with the class that has the most instances in Di, and build the child nodes; the node together with its child nodes forms the tree T; return T;
+  6. For the i-th child node, take Di as the training set and A - {Ag} as the feature set, recursively apply steps 1 to 5 to obtain the subtree Ti, and return Ti.
+
diff --git a/机器学习算法理论及应用/李航——统计学习方法/assets/1618749843112.png b/机器学习算法理论及应用/李航——统计学习方法/assets/1618749843112.png
new file mode 100644
index 0000000000000000000000000000000000000000..a58aea45ccb3d161e2ed11eab8e1173b6f140f51
Binary files /dev/null and b/机器学习算法理论及应用/李航——统计学习方法/assets/1618749843112.png differ
diff --git a/机器学习算法理论及应用/李航——统计学习方法/assets/1618749968444.png b/机器学习算法理论及应用/李航——统计学习方法/assets/1618749968444.png
new file mode 100644
index 0000000000000000000000000000000000000000..d9649ab44cbbc2dd124fa66f4de10b106d73cc0d
Binary files /dev/null and b/机器学习算法理论及应用/李航——统计学习方法/assets/1618749968444.png differ
diff --git a/机器学习算法理论及应用/李航——统计学习方法/assets/1618750408613.png b/机器学习算法理论及应用/李航——统计学习方法/assets/1618750408613.png
new file mode 100644
index 0000000000000000000000000000000000000000..f3add7acf99411f801c9e2168f210ba4e1e45da8
Binary files /dev/null and b/机器学习算法理论及应用/李航——统计学习方法/assets/1618750408613.png differ