
How to Avoid These 8 Common Deep Learning / Computer Vision Mistakes

2019-10-28 09:47

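People are not perfect, and we often make mistakes in our programs. Sometimes these mistakes are easy to find: your code does not work at all, your application crashes, and so on. But some bugs are hidden, which makes them more dangerous.

Deep learning problems are especially prone to this kind of bug because of their inherent uncertainty: it is easy to see whether a web application endpoint routes requests correctly, but much harder to check whether your gradient descent step is correct. Still, many of the bugs in a DL practitioner's routine can be avoided.

I would like to share some of my experience with bugs that I have seen or made during the last two years of computer vision work. I have spoken about this topic at a conference, and many people told me afterwards: "Yeah, man, I had plenty of these bugs too." I hope this article helps you avoid at least some of them.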

1. Flipping images and keypoints

Suppose you are working on a keypoint detection problem. The data looks like an image paired with a sequence of keypoint tuples, e.g. [(0, 1), (2, 2)], where each keypoint is a pair of coordinates (the code below unpacks each pair as y, x).

Let's implement a basic augmentation for this data:

from typing import Sequence
import numpy as np

def flip_img_and_keypoints(img: np.ndarray, kpts: Sequence[Sequence[int]]):
    img = np.fliplr(img)
    h, w, *_ = img.shape
    kpts = [(y, w - x) for y, x in kpts]
    return img, kpts

It looks correct. Well, let's visualize the result:

import matplotlib.pyplot as plt

image = np.ones((10, 10), dtype=np.float32)
kpts = [(0, 1), (2, 2)]
image_flipped, kpts_flipped = flip_img_and_keypoints(image, kpts)
img1 = image.copy()
for y, x in kpts:
    img1[y, x] = 0
img2 = image_flipped.copy()
for y, x in kpts_flipped:
    img2[y, x] = 0
_ = plt.imshow(np.hstack((img1, img2)))

The asymmetry looks odd! What if we check an extreme case?

image = np.ones((10, 10), dtype=np.float32)
kpts = [(0, 0), (1, 1)]
image_flipped, kpts_flipped = flip_img_and_keypoints(image, kpts)
img1 = image.copy()
for y, x in kpts:
    img1[y, x] = 0
img2 = image_flipped.copy()
for y, x in kpts_flipped:
    img2[y, x] = 0

Out:

IndexError                                Traceback (most recent call last)
<ipython-input-5-997162463eae> in <module>
      8 img2 = image_flipped.copy()
      9 for y, x in kpts_flipped:
---> 10     img2[y, x] = 0
IndexError: index 10 is out of bounds for axis 1 with size 10

The program raises an error! This is a classic off-by-one error. The correct code looks like this:

def flip_img_and_keypoints(img: np.ndarray, kpts: Sequence[Sequence[int]]):
    img = np.fliplr(img)
    h, w, *_ = img.shape
    kpts = [(y, w - x - 1) for y, x in kpts]
    return img, kpts

We could have caught this problem with visualization, and a unit test at the x = 0 point would have helped as well.
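A unit test at the x = 0 point could look roughly like the sketch below. It repeats the corrected function so that the block runs on its own; the test name and the chosen keypoints are illustrative assumptions, not part of the original code.

```python
from typing import List, Sequence, Tuple

import numpy as np


def flip_img_and_keypoints(
    img: np.ndarray, kpts: Sequence[Sequence[int]]
) -> Tuple[np.ndarray, List[Tuple[int, int]]]:
    """Corrected horizontal flip: column x maps to w - x - 1."""
    img = np.fliplr(img)
    h, w, *_ = img.shape
    kpts = [(y, w - x - 1) for y, x in kpts]
    return img, kpts


def test_flip_edge_columns() -> None:
    # Mark two border pixels so that an off-by-one error becomes visible.
    img = np.ones((10, 10), dtype=np.float32)
    kpts = [(0, 0), (3, 9)]
    for y, x in kpts:
        img[y, x] = 0

    flipped, flipped_kpts = flip_img_and_keypoints(img, kpts)

    # x = 0 must land on the last valid column, not on index w.
    assert flipped_kpts == [(0, 9), (3, 0)]
    # Every flipped keypoint must still point at its marked pixel.
    for y, x in flipped_kpts:
        assert flipped[y, x] == 0
    # Flipping twice must restore the original image and keypoints.
    restored, restored_kpts = flip_img_and_keypoints(flipped, flipped_kpts)
    assert np.array_equal(restored, img)
    assert restored_kpts == kpts


test_flip_edge_columns()
```

Checking both border columns and the flip-twice round trip catches the `w - x` version immediately, because it either raises an IndexError or points at an unmarked pixel.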

2. Keypoints again

Even after the bug above is fixed, a problem remains. This time it is more about semantics than about code.

Suppose you need to augment images that contain two palms. It looks safe: after a left-right flip, hands are still hands.

But wait! We know nothing about the semantics of our keypoints. What if the keypoints mean something like this:

kpts = [
    (20, 20),   # left little finger
    (20, 200),  # right little finger
    ...
]

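That means the augmentation actually changes the semantics: left becomes right and right becomes left, but we do not swap the keypoint indices in the array. This adds a great deal of noise to training and leads to much worse metrics.

The lessons to learn:

Know and think about the data structure and semantics before applying augmentations or other features;

Keep your experiments atomic: add one small change (e.g. a new transform), check how it goes, and merge it if the score improves.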
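One possible way to keep the semantics intact is to swap the indices of symmetric keypoints after mirroring. The sketch below assumes a hypothetical layout in which index 0 is the left little finger and index 1 the right one; `SWAP_PAIRS` and `flip_with_semantics` are illustrative names, not part of any library.

```python
from typing import List, Sequence, Tuple

import numpy as np

# Hypothetical layout: index 0 is the left little finger, index 1 the right one.
# Each pair lists two keypoint indices that trade places under a horizontal flip.
SWAP_PAIRS = [(0, 1)]


def flip_with_semantics(
    img: np.ndarray,
    kpts: Sequence[Tuple[int, int]],
    swap_pairs: Sequence[Tuple[int, int]] = SWAP_PAIRS,
) -> Tuple[np.ndarray, List[Tuple[int, int]]]:
    """Flip the image left-right and keep keypoint indices semantically valid."""
    img = np.fliplr(img)
    w = img.shape[1]
    flipped = [(y, w - x - 1) for y, x in kpts]
    # After mirroring, the point stored under "left" physically looks like the
    # right one, so exchange the entries of every symmetric pair.
    for a, b in swap_pairs:
        flipped[a], flipped[b] = flipped[b], flipped[a]
    return img, flipped


image = np.zeros((100, 256), dtype=np.uint8)
kpts = [(20, 20), (20, 200)]  # (y, x): left little finger, right little finger
_, new_kpts = flip_with_semantics(image, kpts)
print(new_kpts)
```

The list of pairs has to come from whoever defined the annotation scheme; it cannot be inferred from the coordinates alone.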

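3. Coding a custom loss function

Those familiar with semantic segmentation probably know the IoU metric. Unfortunately, we cannot optimize it directly with SGD, so the common approach is to approximate it with a differentiable loss function. Let's code one!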

def iou_continuous_loss(y_pred, y_true):
    eps = 1e-6
    def _sum(x):
        return x.sum(-1).sum(-1)
    numerator = (_sum(y_true * y_pred) + eps)
    denominator = (_sum(y_true ** 2) + _sum(y_pred ** 2)
                   - _sum(y_true * y_pred) + eps)
    return (numerator / denominator).mean()

Looks good. Let's test it:

In [3]: ones = np.ones((1, 3, 10, 10))
   ...: x1 = iou_continuous_loss(ones * 0.01, ones)
   ...: x2 = iou_continuous_loss(ones * 0.99, ones)
In [4]: x1, x2
Out[4]: (0.010099999897990103, 0.9998990001020204)

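In x1 we computed the loss for a prediction that is completely different from the ground truth, and in x2 the loss for a prediction that is very close to it. We expect x1 to be large because the prediction is bad, and x2 to be close to 0. The result is the opposite. What went wrong?

The function above is a good approximation of the metric. But a metric is not a loss: usually (including this case) the higher it is, the better. Since SGD minimizes the loss, we should invert the value: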

def iou_continuous(y_pred, y_true):
    eps = 1e-6
    def _sum(x):
        return x.sum(-1).sum(-1)
    numerator = (_sum(y_true * y_pred) + eps)
    denominator = (_sum(y_true ** 2) + _sum(y_pred ** 2)
                   - _sum(y_true * y_pred) + eps)
    return (numerator / denominator).mean()

def iou_continuous_loss(y_pred, y_true):
    return 1 - iou_continuous(y_pred, y_true)

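Problems like this can be caught in two ways:

Write a unit test that checks the direction of the loss, i.e. a prediction closer to the ground truth should produce a lower loss;

Run a sanity check, for example by trying to overfit the model on a single batch.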
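A direction test for the loss might look like the following sketch. It repeats the corrected functions so that it runs on its own; the test name and the three sample predictions are assumptions chosen for illustration.

```python
import numpy as np


def iou_continuous(y_pred: np.ndarray, y_true: np.ndarray) -> float:
    """Soft IoU averaged over batch and channels (higher is better)."""
    eps = 1e-6

    def _sum(x: np.ndarray) -> np.ndarray:
        return x.sum(-1).sum(-1)

    numerator = _sum(y_true * y_pred) + eps
    denominator = (_sum(y_true ** 2) + _sum(y_pred ** 2)
                   - _sum(y_true * y_pred) + eps)
    return (numerator / denominator).mean()


def iou_continuous_loss(y_pred: np.ndarray, y_true: np.ndarray) -> float:
    """Loss form of the metric (lower is better)."""
    return 1 - iou_continuous(y_pred, y_true)


def test_loss_direction() -> None:
    y_true = np.ones((1, 3, 10, 10))
    bad = iou_continuous_loss(y_true * 0.01, y_true)
    good = iou_continuous_loss(y_true * 0.99, y_true)
    perfect = iou_continuous_loss(y_true, y_true)
    # The closer the prediction is to the ground truth, the lower the loss.
    assert bad > good > perfect
    # A perfect prediction should cost (almost) nothing.
    assert abs(perfect) < 1e-6


test_loss_direction()
```

Run against the first version of the function, which returned the metric itself, the same test fails on the first assertion.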

4. When we meet PyTorch

Suppose you have a pretrained model. Let's write a Predictor class based on the ceevee API.

import torch
from ceevee.base import AbstractPredictor

class MySuperPredictor(AbstractPredictor):
    def __init__(self,
                 weights_path: str,
                 ):
        super().__init__()
        self.model = self._load_model(weights_path=weights_path)

    def process(self, x, *kw):
        with torch.no_grad():
            res = self.model(x)
        return res

    @staticmethod
    def _load_model(weights_path):
        model = ModelClass()
        weights = torch.load(weights_path, map_location='cpu')
        model.load_state_dict(weights)
        return model

Is this code correct? Maybe! It really is correct for some models, for example when the model has no dropout or normalization layers such as torch.nn.BatchNorm2d.

But for most computer vision applications the code misses something important: switching the model to evaluation mode.

This problem is easy to notice if you try to convert a dynamic PyTorch graph into a static one. The torch.jit module is used for this kind of conversion.

In [3]: model = nn.Sequential(
   ...:     nn.Linear(10, 10),
   ...:     nn.Dropout(.5)
   ...: )
   ...:
   ...: traced_model = torch.jit.trace(model, torch.rand(10))
/Users/Arseny/.pyenv/versions/3.6.6/lib/python3.6/site-packages/torch/jit/__init__.py:914: TracerWarning: Trace had nondeterministic nodes. Did you forget call .eval() on your model? Nodes:
    %12 : Float(10) = aten::dropout(%input, %10, %11), scope: Sequential/Dropout[1] # /Users/Arseny/.pyenv/versions/3.6.6/lib/python3.6/site-packages/torch/nn/functional.py:806:0
This may cause errors in trace checking. To disable trace checking, pass check_trace=False to torch.jit.trace()
  check_tolerance, _force_outplace, True, _module_class)
/Users/Arseny/.pyenv/versions/3.6.6/lib/python3.6/site-packages/torch/jit/__init__.py:914: TracerWarning: Output nr 1. of the traced function does not match the corresponding output of the Python function. Detailed error:
Not within tolerance rtol=1e-05 atol=1e-05 at input[5] (0.0 vs. 0.5454154014587402) and 5 other locations (60.00%)
  check_tolerance, _force_outplace, True, _module_class)

A simple fix:

In [4]: model = nn.Sequential(
   ...:     nn.Linear(10, 10),
   ...:     nn.Dropout(.5)
   ...: )
   ...:
   ...: traced_model = torch.jit.trace(model.eval(), torch.rand(10))
# No more warnings!

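torch.jit.trace runs the model several times and compares the results, which is why the mismatch shows up here.
However, torch.jit.trace is not a cure-all; this is a nuance you should know and remember.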
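Independently of tracing, the effect of evaluation mode can be checked directly. The sketch below uses a toy two-layer network as a stand-in for a real pretrained model; `prepare_for_inference` is a hypothetical helper name, and in a predictor class the same `.eval()` call would go right after the weights are loaded.

```python
import torch
from torch import nn


def prepare_for_inference(model: nn.Module) -> nn.Module:
    """Switch dropout and normalization layers to their inference behaviour."""
    model.eval()  # sets training = False on the module and all submodules
    return model


# Toy stand-in for a real pretrained network.
model = prepare_for_inference(nn.Sequential(nn.Linear(10, 10), nn.Dropout(0.5)))

x = torch.rand(4, 10)
with torch.no_grad():
    first = model(x)
    second = model(x)

# In eval mode dropout is the identity, so repeated calls agree exactly.
assert not model.training
assert torch.equal(first, second)
```

Note that `torch.no_grad()` only disables gradient tracking; it does not change how dropout or batch normalization behave, so both calls are needed.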

5. Copy-paste issue

Many things come in pairs: train and validation, width and height, latitude and longitude... If you read carefully, you will easily spot a bug caused by copying and pasting from one member of a pair to the other:

def make_dataloaders(train_cfg, val_cfg, batch_size):
    train = Dataset.from_config(train_cfg)
    val = Dataset.from_config(val_cfg)
    shared_params = {'batch_size': batch_size, 'shuffle': True, 'num_workers': cpu_count()}
    train = DataLoader(train, **shared_params)
    val = DataLoader(train, **shared_params)
    return train, val

I am not the only one who makes silly mistakes. For example, the popular albumentations library had a similar issue:

# https://github.com/albu/albumentations/blob/0.3.0/albumentations/augmentations/transforms.py
def apply_to_keypoint(self, keypoint, crop_height=0, crop_width=0, h_start=0, w_start=0, rows=0, cols=0, **params):
    keypoint = F.keypoint_random_crop(keypoint, crop_height, crop_width, h_start, w_start, rows, cols)
    scale_x = self.width / crop_height
    scale_y = self.height / crop_height
    keypoint = F.keypoint_scale(keypoint, scale_x, scale_y)
    return keypoint

Don't worry, it has already been fixed.

How to avoid it? Try to write code in a way that does not require copying and pasting.

The following style is not a good one:

datasets = []
data_a = get_dataset(MyDataset(config['dataset_a']), config['shared_param'], param_a)
datasets.append(data_a)
data_b = get_dataset(MyDataset(config['dataset_b']), config['shared_param'], param_b)
datasets.append(data_b)

While the following looks much better:

datasets = []
for name, param in zip(('dataset_a', 'dataset_b'),
                       (param_a, param_b),
                       ):
    datasets.append(get_dataset(MyDataset(config[name]), config['shared_param'], param))
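The same idea can be applied to the train/validation pair from the make_dataloaders example. The sketch below is only an illustration: `make_loaders` and `build_loader` are hypothetical names, and in a PyTorch project `build_loader` would wrap the dataset construction and the DataLoader call.

```python
from typing import Any, Callable, Dict


def make_loaders(
    configs: Dict[str, Any],
    build_loader: Callable[[Any], Any],
) -> Dict[str, Any]:
    """Build one loader per split in a single loop, so no line is duplicated."""
    # `build_loader` is a hypothetical factory supplied by the caller.
    return {split: build_loader(cfg) for split, cfg in configs.items()}


# Dummy factory that only records which config it received.
loaders = make_loaders(
    {"train": "train.yaml", "val": "val.yaml"},
    build_loader=lambda cfg: {"source": cfg},
)
assert loaders["val"] == {"source": "val.yaml"}
```

Because each split is created by the same line of code, there is no second line in which `train` could be pasted where `val` was intended.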

6. Proper data types

Let's write a new augmentation:

def add_noise(img: np.ndarray) -> np.ndarray:
    mask = np.random.rand(*img.shape) + .5
    img = img.astype('float32') * mask
    return img.astype('uint8')

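The image has been altered. Is this what we expected? Hmm, maybe it has been changed a bit too much.

There is a dangerous operation here: casting float32 to uint8. It may cause an overflow, so the values should be clipped first: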

def add_noise(img: np.ndarray) -> np.ndarray:
    mask = np.random.rand(*img.shape) + .5
    img = img.astype('float32') * mask
    return np.clip(img, 0, 255).astype('uint8')

img = add_noise(cv2.imread('two_hands.jpg')[:, :, ::-1])
_ = plt.imshow(img)

Looks much better, right?

By the way, there is one more way to avoid this issue: do not reinvent the wheel. Instead of writing augmentation code from scratch, use an existing one, e.g. albumentations.augmentations.transforms.GaussNoise.

I once made another bug of the same origin:

raw_mask = cv2.imread('mask_small.png')
mask = raw_mask.astype('float32') / 255
mask = cv2.resize(mask, (64, 64), interpolation=cv2.INTER_LINEAR)
mask = cv2.resize(mask, (128, 128), interpolation=cv2.INTER_CUBIC)
mask = (mask * 255).astype('uint8')
_ = plt.imshow(np.hstack((raw_mask, mask)))

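What is wrong here? First of all, resizing a mask with cubic interpolation is a bad idea. The problem is the same as with casting float32 to uint8: cubic interpolation can output values larger than the input, which leads to overflow.

I found this problem while doing visualization. Using assertions throughout your training loop is also a good idea.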
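As a rough illustration of such assertions, the sketch below pairs a range check for float masks with a clipping conversion to uint8. Both helper names (`check_mask`, `to_uint8`) are hypothetical, and the sample values merely mimic interpolation overshoot rather than coming from a real resize.

```python
import numpy as np


def to_uint8(img: np.ndarray) -> np.ndarray:
    """Convert a float image on the 0..255 scale to uint8 without wrap-around."""
    return np.clip(np.rint(img), 0, 255).astype("uint8")


def check_mask(mask: np.ndarray) -> np.ndarray:
    """Fail fast if a float mask has left the [0, 1] range."""
    assert np.issubdtype(mask.dtype, np.floating), f"unexpected dtype {mask.dtype}"
    assert mask.min() >= 0.0 and mask.max() <= 1.0, (
        f"mask range [{mask.min()}, {mask.max()}] is outside [0, 1]"
    )
    return mask


# Values such as -5 and 300 mimic what interpolation overshoot can produce.
overshoot = np.array([-5.0, 100.4, 300.0], dtype=np.float32)
print(to_uint8(overshoot))

# A mask that is still in range passes through unchanged.
check_mask(np.array([0.0, 0.5, 1.0], dtype=np.float32))
```

Calling `check_mask` right after every resize or augmentation step turns a silent overflow into an immediate, readable failure.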

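7. Typos happen

Suppose you need to run inference with a fully convolutional network (e.g. for a semantic segmentation problem) on a huge image. The image is so large that there is no chance of fitting it on your GPU; it could be a medical or satellite image, for example.

In such a case you can slice the image into a grid, run inference on each piece independently, and finally merge the results. Also, some overlap between predictions can help smooth out artifacts near the borders.

Let's code it!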

from typing import Optional

import numpy as np
from tqdm import tqdm

class GridPredictor:
    """
    This class can be used to predict a segmentation mask for a large image
    when you are limited by GPU memory
    """
    def __init__(self, predictor: AbstractPredictor, size: int, stride: Optional[int] = None):
        self.predictor = predictor
        self.size = size
        self.stride = stride if stride is not None else size // 2

    def __call__(self, x: np.ndarray):
        h, w, _ = x.shape
        mask = np.zeros((h, w, 1), dtype='float32')
        weights = mask.copy()
        for i in tqdm(range(0, h - 1, self.stride)):
            for j in range(0, w - 1, self.stride):
                a, b, c, d = i, min(h, i + self.size), j, min(w, j + self.size)
                patch = x[a:b, c:d, :]
                mask[a:b, c:d, :] += np.expand_dims(self.predictor(patch), -1)
                weights[a:b, c:d, :] = 1
        return mask / weights

There is a one-symbol typo in this code. A quick check of whether the code is correct makes it easy to find:

class Model(nn.Module):
    def forward(self, x):
        return x.mean(axis=-1)

model = Model()
grid_predictor = GridPredictor(model, size=128, stride=64)
simple_pred = np.expand_dims(model(img), -1)
grid_pred = grid_predictor(img)
np.testing.assert_allclose(simple_pred, grid_pred, atol=.001)

AssertionError                            Traceback (most recent call last)
<ipython-input-24-a72034c717e9> in <module>
      9 grid_pred = grid_predictor(img)
     10
---> 11 np.testing.assert_allclose(simple_pred, grid_pred, atol=.001)
~/.pyenv/versions/3.6.6/lib/python3.6/site-packages/numpy/testing/_private/utils.py in assert_allclose(actual, desired, rtol, atol, equal_nan, err_msg, verbose)
   1513     header = 'Not equal to tolerance rtol=%g, atol=%g' % (rtol, atol)
   1514     assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
-> 1515                          verbose=verbose, header=header, equal_nan=equal_nan)
   1516
   1517
~/.pyenv/versions/3.6.6/lib/python3.6/site-packages/numpy/testing/_private/utils.py in assert_array_compare(comparison, x, y, err_msg, verbose, header, precision, equal_nan, equal_inf)
    839                                 verbose=verbose, header=header,
    840                                 names=('x', 'y'), precision=precision)
--> 841             raise AssertionError(msg)
    842         except ValueError:
    843             import traceback
AssertionError:
Not equal to tolerance rtol=1e-07, atol=0.001
Mismatch: 99.6%
Max absolute difference: 765.
Max relative difference: 0.75000001
 x: array([[[215.333333],
        [192.666667],
        [250.      ],...
 y: array([[[ 215.33333],
        [ 192.66667],
        [ 250.     ],...

The correct version of the __call__ method is below:

    def __call__(self, x: np.ndarray):
        h, w, _ = x.shape
        mask = np.zeros((h, w, 1), dtype='float32')
        weights = mask.copy()
        for i in tqdm(range(0, h - 1, self.stride)):
            for j in range(0, w - 1, self.stride):
                a, b, c, d = i, min(h, i + self.size), j, min(w, j + self.size)
                patch = x[a:b, c:d, :]
                mask[a:b, c:d, :] += np.expand_dims(self.predictor(patch), -1)
                weights[a:b, c:d, :] += 1
        return mask / weights

If you still do not see what the problem is, pay attention to the line weights[a:b, c:d, :] += 1.
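The correctness check can also be reproduced without PyTorch or ceevee. The following sketch is a NumPy-only reimplementation of the same tiling logic, with a channel mean as a toy predictor; `grid_predict` and `channel_mean` are illustrative names, and the random image is an assumption standing in for real data.

```python
from typing import Callable

import numpy as np


def grid_predict(
    x: np.ndarray,
    predictor: Callable[[np.ndarray], np.ndarray],
    size: int,
    stride: int,
) -> np.ndarray:
    """Average overlapping patch predictions over an (h, w, c) image."""
    h, w, _ = x.shape
    mask = np.zeros((h, w, 1), dtype="float32")
    weights = np.zeros_like(mask)
    for i in range(0, h - 1, stride):
        for j in range(0, w - 1, stride):
            a, b, c, d = i, min(h, i + size), j, min(w, j + size)
            mask[a:b, c:d, :] += np.expand_dims(predictor(x[a:b, c:d, :]), -1)
            weights[a:b, c:d, :] += 1  # count how many patches cover each pixel
    return mask / weights


def channel_mean(patch: np.ndarray) -> np.ndarray:
    """Toy per-pixel predictor: mean over the channel axis."""
    return patch.mean(axis=-1)


rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(200, 300, 3)).astype("float32")

expected = np.expand_dims(channel_mean(img), -1)
actual = grid_predict(img, channel_mean, size=128, stride=64)
np.testing.assert_allclose(actual, expected, rtol=1e-5)
```

A per-pixel predictor is a convenient choice for this kind of test because tiling must not change its output at all, so any mismatch points at the merging code rather than at the model.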

8. ImageNet normalization

When you need to do transfer learning, it is usually a good idea to normalize images the same way it was done when training on ImageNet.

Let's use the albumentations library, which we are already familiar with:

from albumentations import Normalize

norm = Normalize()
img = cv2.imread('img_small.jpg')
mask = cv2.imread('mask_small.png', cv2.IMREAD_GRAYSCALE)
mask = np.expand_dims(mask, -1)  # shape (64, 64) -> shape (64, 64, 1)
normed = norm(image=img, mask=mask)
img, mask = [normed[x] for x in ['image', 'mask']]

def img_to_batch(x):
    x = np.transpose(x, (2, 0, 1)).astype('float32')
    return torch.from_numpy(np.expand_dims(x, 0))

img, mask = map(img_to_batch, (img, mask))
criterion = F.binary_cross_entropy

Now it is time to train a network and overfit a single image. As I mentioned, this is a good debugging technique:

model_a = UNet(3, 1)
optimizer = torch.optim.Adam(model_a.parameters(), lr=1e-3)
losses = []
for t in tqdm(range(20)):
    loss = criterion(model_a(img), mask)
    losses.append(loss.item())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
_ = plt.plot(losses)

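The curve looks fine, but -300 is not a value we expect from a cross-entropy loss. What is the problem?

Normalization works well for the image, but it does not touch the mask: the mask still has to be scaled to the range [0, 1].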

model_b = UNet(3, 1)
optimizer = torch.optim.Adam(model_b.parameters(), lr=1e-3)
losses = []
for t in tqdm(range(20)):
    loss = criterion(model_b(img), mask / 255.)
    losses.append(loss.item())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
_ = plt.plot(losses)

A simple runtime assertion in the training loop (e.g. assert mask.max() <= 1) would detect the problem quickly. Again, a unit test would work as well.
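To see why an unscaled mask pushes the loss below zero, it helps to write binary cross-entropy out by hand. The sketch below is a plain NumPy version used only for illustration (it is not the PyTorch implementation), and `checked_bce` is a hypothetical wrapper that adds the assertion suggested above.

```python
import numpy as np


def binary_cross_entropy(pred: np.ndarray, target: np.ndarray) -> float:
    """Plain NumPy BCE, written out to show what happens with bad targets."""
    return float(-(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean())


def checked_bce(pred: np.ndarray, target: np.ndarray) -> float:
    """BCE that refuses targets outside [0, 1]."""
    assert target.min() >= 0 and target.max() <= 1, "target must be scaled to [0, 1]"
    return binary_cross_entropy(pred, target)


pred = np.full((4, 4), 0.99)
raw_mask = np.full((4, 4), 255.0)  # mask that was never divided by 255

# With target = 255 the (1 - target) factor equals -254, so the "loss" turns
# negative and keeps decreasing as the prediction approaches 1.
print(binary_cross_entropy(pred, raw_mask))

# After scaling, the same prediction gives a small positive loss.
print(checked_bce(pred, raw_mask / 255.0))
```

With the assertion in place, passing the raw mask fails immediately instead of producing a plausible-looking but meaningless loss curve.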