Tensorflow in Practice 1, Computer Vision

EUJU 2020. 6. 21. 22:43

Traditional programming

Rules, Data -> "Programming" -> Answers

Machine learning

Answers, Data -> "Machine Learning" -> Rules

The model learns from data that has been labeled.

#neural net

A neural network is basically a set of functions which can learn patterns.

Successive layers are defined in sequence.

The neural network is not told the relationship between x and y; it has to learn it from the data.

import tensorflow as tf
from tensorflow import keras
import numpy as np

# One Dense layer with a single neuron (units=1) that takes a single input value
model = keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
model.compile(optimizer='sgd', loss='mean_squared_error')

xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)

model.fit(xs, ys, epochs=500)  # training loop

print(model.predict(np.array([10.0])))

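Rule: y = 2x - 1

Result: [[18.981585]]

Why the result is not exactly 19:

1. The model was trained on very little data (only 6 points)

2. The network deals in probabilities: it estimates that the relationship is very likely y = 2x - 1, but it cannot be certain

The prediction becomes more accurate as the epochs go on.

-> In each epoch, the loss function measures how good or bad the current guess was, and the optimizer uses that to produce the next guess

A decreasing loss means the model is getting closer to the true relationship between x and y.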
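To make the guess / loss / optimizer cycle concrete, here is a minimal NumPy sketch of the same fit done by hand. It is not what Keras runs internally: it assumes plain full-batch gradient descent, weights starting at zero, and a learning rate of 0.01, and the names `train` and `lr` are made up for this example.

```python
import numpy as np

xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0])
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0])

def train(xs, ys, epochs=500, lr=0.01):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0                              # assumed starting guess
    losses = []
    for _ in range(epochs):
        err = (w * xs + b) - ys                  # how wrong the current guess is
        losses.append(float(np.mean(err ** 2)))  # loss: mean squared error
        # "optimizer" step: move w and b against the gradient of the loss
        w -= lr * 2.0 * np.mean(err * xs)
        b -= lr * 2.0 * np.mean(err)
    return w, b, losses

w, b, losses = train(xs, ys)
prediction = w * 10.0 + b                        # close to, but not exactly, 19
print(w, b, prediction, losses[0], losses[-1])
```

The loss shrinks every epoch, and after 500 epochs w and b are near 2 and -1 without being exactly equal to them, which is why the prediction for x = 10 lands just short of 19.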

- Computer vision is hard to solve with rules-based programming. -> Let's use deep learning

Deep learning lets a computer look at a picture and recognize what is in it.

Computer Vision is the field of having a computer understand and label what is present in an image.

Fashion MNIST

https://github.com/zalandoresearch/fashion-mnist

10 different items, 28 * 28 pixels, grayscale, 70k images

-> 784 bytes per image (28 * 28 pixels, one byte per pixel)

Machine Learning depends on having good data to train a system with.

train: use some of my data (60,000 images)

test: similar data that the model hasn't yet seen to test how good it is at recognizing the images (10,000 images)

Why labels are numbers (e.g. why an ankle boot is class 9 rather than the words "ankle boot")

1. Computers work with numbers better than with text

2. It helps reduce bias -> numbers are used in every country, unlike words in one language

(recency bias, latent bias, selection bias) ?

Divide the image pixel values by 255 so that they fall in the range 0 to 1.
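A small sketch of that normalization step. To keep it runnable without a download, it uses random uint8 arrays as a stand-in for Fashion MNIST images (that stand-in is an assumption of this example, not the real dataset).

```python
import numpy as np

# Stand-in for a batch of 4 grayscale images, 28 x 28, pixel values 0..255
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Convert to float first, then scale into the range 0..1
normalized = images.astype(np.float32) / 255.0

values_per_image = images[0].size   # 28 * 28 = 784
print(values_per_image, normalized.min(), normalized.max())
```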

Flatten: Remember earlier where our images were a square, when you printed them out? Flatten just takes that square and turns it into a 1 dimensional set.

Dense: Adds a layer of neurons

Each layer of neurons needs an activation function to tell it what to do. There are lots of options, but just use these for now.

Relu effectively means "If X>0 return X, else return 0", so what it does is it only passes values 0 or greater to the next layer in the network.

Softmax takes a set of values and turns them into probabilities that sum to 1, so the biggest one stands out. For example, if the output of the last layer looks like [0.1, 0.1, 0.05, 0.1, 9.5, 0.1, 0.05, 0.05, 0.05], softmax gives roughly [0,0,0,0,1,0,0,0,0], which saves you from fishing through it looking for the biggest value. The goal is to save a lot of coding!
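The three pieces above (Flatten, Relu, Softmax) can be sketched in a few lines of NumPy. These are hand-written stand-ins for illustration, not the Keras layers themselves; the softmax input is the example vector from the text.

```python
import numpy as np

def relu(x):
    """If x > 0 return x, else return 0 (element-wise)."""
    return np.maximum(x, 0.0)

def softmax(x):
    """Turn raw scores into probabilities that sum to 1."""
    exps = np.exp(x - np.max(x))   # subtract the max for numerical stability
    return exps / np.sum(exps)

# Flatten: a 2 x 3 "image" becomes a 1-dimensional set of 6 values
image = np.arange(6, dtype=float).reshape(2, 3)
flat = image.reshape(-1)

last_layer = np.array([0.1, 0.1, 0.05, 0.1, 9.5, 0.1, 0.05, 0.05, 0.05])
probs = softmax(last_layer)
predicted = int(np.argmax(probs))  # index of the most likely class

print(flat.shape, relu(np.array([-2.0, 0.0, 3.0])), predicted)
```

Note that softmax does not literally output zeros and a one; index 4 just gets nearly all of the probability mass.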

Improving accuracy

1. Increase the number of hidden layers

2. Increase the number of epochs

How to stop training once it reaches the point I want

: set a callback in the parameters of model.fit()

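class myCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        # The metric key is 'accuracy' in TF 2.x and 'acc' in older versions
        logs = logs or {}
        acc = logs.get('accuracy', logs.get('acc'))
        if acc is not None and acc >= 0.99:
            print("Reached 99% accuracy so cancelling training!")
            self.model.stop_training = True

callback = myCallback()
# Pass it to training with model.fit(..., callbacks=[callback])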

#Convolutional Neural Networks

Pooling is a way of compressing an image.

tf.keras.layers.Conv2D(
    filters, kernel_size, strides=(1, 1), padding='valid', data_format=None,
    dilation_rate=(1, 1), activation=None, use_bias=True,
    kernel_initializer='glorot_uniform', bias_initializer='zeros',
    kernel_regularizer=None, bias_regularizer=None, activity_regularizer=None,
    kernel_constraint=None, bias_constraint=None, **kwargs
)

https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D

tf.keras.layers.MaxPool2D(
    pool_size=(2, 2), strides=None, padding='valid', data_format=None, **kwargs
)

https://www.tensorflow.org/api_docs/python/tf/keras/layers/MaxPool2D

https://www.youtube.com/playlist?list=PLkDaE6sCZn6Gl29AoE31iwdVwSG-KnDzF

Convolutional Neural Networks (Course 4 of the Deep Learning Specialization) - YouTube

Max pooling example (2 * 2 pool):

input (4 * 4)    output (2 * 2)
1 1 8 4          7 9
5 7 9 1          7 8
5 7 8 1
4 5 4 1

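- Applying filters -> reduces the amount of information and effectively extracts the features that distinguish one class from another

- Pooling -> compresses the information so it is easier to manage

==> A good way to improve image recognition performance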
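Both steps are easy to try by hand. The sketch below is a plain NumPy illustration, not the Conv2D / MaxPool2D implementation: `conv2d_valid` and `max_pool_2x2` are names invented here, the 4 * 4 grid reuses the numbers from the table above, and the vertical-edge filter is just one example filter.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding ('valid')."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool_2x2(image):
    """Keep only the largest value of every 2 x 2 block."""
    h, w = image.shape[0] // 2, image.shape[1] // 2
    blocks = image[:h * 2, :w * 2].reshape(h, 2, w, 2)
    return blocks.max(axis=(1, 3))

grid = np.array([[1, 1, 8, 4],
                 [5, 7, 9, 1],
                 [5, 7, 8, 1],
                 [4, 5, 4, 1]])

pooled = max_pool_2x2(grid)            # 4 x 4 -> 2 x 2

vertical_edges = np.array([[-1, 0, 1],
                           [-1, 0, 1],
                           [-1, 0, 1]])
features = conv2d_valid(grid, vertical_edges)  # 4 x 4 -> 2 x 2 with a 3 x 3 filter

print(pooled)
print(features)
```

Pooling quarters the number of values while keeping the strongest response in each block, and a 3 * 3 filter without padding trims one pixel from every edge.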

https://github.com/lmoroney/dlaicourse/blob/master/Course%201%20-%20Part%206%20-%20Lesson%202%20-%20Notebook.ipynb

https://github.com/lmoroney/dlaicourse/blob/master/Course%201%20-%20Part%206%20-%20Lesson%203%20-%20Notebook.ipynb

#Week 4

https://www.youtube.com/watch?v=NlpS-DhayQA

#Image Generator

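Feature: point it at a directory and it generates labels automatically from the subdirectories

It is a utility (not a network itself) that automatically loads files and labels them based on the subdirectory they sit in

Subdirectories of the image directory: Training & Validation

Store the horse and human images in their own subdirectories under each of these -> the Image Generator creates a feeder for those images and builds the labels by itself.

(When the Image Generator is pointed at the Training directory, the labels are horse and human, and every image is loaded already labeled)

rescale is used to normalize the data

Call the flow_from_directory method on the generator
-> loads the images from that directory and its subdirectories
(Point it at the parent directory, not at one of the class subdirectories!)

target_size: the input size of the neural net must always be the same, so every image is resized to one consistent size
(the resize happens at load time, so no separate preprocessing is needed)

batch_size: affects performance; try different values and compare

The subdirectory names become the image labels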
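The idea of "subdirectory name = label" can be sketched with the standard library alone. This is not how ImageDataGenerator is implemented; `infer_labels` is a hypothetical helper, the files are empty placeholders, and the alphabetical ordering of class indices mirrors what I understand flow_from_directory to do.

```python
import os
import tempfile

def infer_labels(root):
    """Map every file under root/<class_name>/ to an integer label.

    Class indices follow the alphabetical order of the subdirectory names.
    """
    class_names = sorted(
        d for d in os.listdir(root) if os.path.isdir(os.path.join(root, d))
    )
    class_indices = {name: i for i, name in enumerate(class_names)}
    samples = []
    for name in class_names:
        folder = os.path.join(root, name)
        for filename in sorted(os.listdir(folder)):
            samples.append((os.path.join(folder, filename), class_indices[name]))
    return class_indices, samples

# Build a throwaway "training" directory with two class subdirectories
with tempfile.TemporaryDirectory() as root:
    for name, count in (("humans", 2), ("horses", 3)):
        os.makedirs(os.path.join(root, name))
        for k in range(count):
            open(os.path.join(root, name, f"img{k}.png"), "w").close()
    class_indices, samples = infer_labels(root)

print(class_indices)  # {'horses': 0, 'humans': 1}
```

Pointing the helper at `root` works; pointing it at `root/horses` would find no class subdirectories at all, which is the reason for the warning above.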

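Three convolution layers
-> to handle the higher complexity and size of the images

298 * 298 -> 35 * 35 (after the three convolution + pooling stages)

300 * 300 input image, 3 RGB channels

Sigmoid works very well for binary classification: one class is pushed toward 0 and the other toward 1

Alternatively, softmax can be used with 2 output neurons.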
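The two options are closely related: a 2-neuron softmax whose first logit is fixed at 0 gives the same probability as a single sigmoid neuron. A short NumPy check, with the logit value 1.7 picked arbitrarily for the example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    exps = np.exp(z - np.max(z))
    return exps / exps.sum()

z = 1.7                                        # arbitrary example logit
p_sigmoid = sigmoid(z)                         # one output neuron
p_softmax = softmax(np.array([0.0, z]))[1]     # two output neurons, class 1

print(p_sigmoid, p_softmax)
```

So for two classes the single sigmoid output is the cheaper way to express the same thing.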

Number of values going into Flatten after the convolutions
== 78,400 (35 * 35 * 64)

Without the convolutions
== 270,000 (300 * 300 * 3)

-> the amount of data is cut down considerably
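The 35 * 35 figure can be checked with simple arithmetic. The sketch assumes 3 * 3 filters without padding and 2 * 2 pooling after each convolution, and 64 filters in the last convolution (the 64 comes from the 35 * 35 * 64 above); the helper names are made up.

```python
def conv_out(size, kernel=3):
    """A 'valid' convolution trims (kernel - 1) pixels in total."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Pooling divides the size by the pool width, rounding down."""
    return size // pool

size = 300
trace = [size]
for _ in range(3):            # three convolution + pooling stages
    size = conv_out(size)
    trace.append(size)
    size = pool_out(size)
    trace.append(size)

flatten_with_conv = size * size * 64       # 64 feature maps of 35 x 35
flatten_without_conv = 300 * 300 * 3       # raw RGB pixels

print(trace)                  # [300, 298, 149, 147, 73, 71, 35]
print(flatten_with_conv, flatten_without_conv)
```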

- model.compile

Because this is a binary choice, the loss is binary_crossentropy

Optimizer: RMSprop
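For intuition about what binary_crossentropy measures, here is a hand-rolled NumPy version. It is a sketch of the formula, not the Keras function; the clipping constant `eps` and the example predictions are assumptions for illustration.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y * log(p) + (1 - y) * log(1 - p)] over all samples."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)   # avoid log(0)
    per_sample = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return float(np.mean(per_sample))

y_true = np.array([1.0, 0.0, 1.0, 0.0])

# Confident and right -> small loss; confident and wrong -> large loss
good = binary_crossentropy(y_true, np.array([0.9, 0.1, 0.8, 0.2]))
bad = binary_crossentropy(y_true, np.array([0.1, 0.9, 0.2, 0.8]))

print(good, bad)
```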

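- training, model.fit_generator

Because we are using a generator instead of a dataset, we call model.fit_generator rather than model.fit. (From TF 2.1 on, model.fit accepts generators directly and fit_generator is deprecated.)

train_generator: the training generator set up earlier
- streams the images from the training directory
=================

1,024 images loaded 128 at a time -> 8 batches are needed to load every image
--> steps_per_epoch = 8
==================
validation_generator
-> 256 images, loaded 32 at a time
-> validation_steps = 8
==================
verbose parameter
-> controls how much is displayed while training is in progress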
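The step counts are just the number of images divided by the batch size, rounded up so that a final partial batch is still used. A tiny sketch (the helper name `steps_for` is made up):

```python
import math

def steps_for(num_images, batch_size):
    """Batches needed to see every image once."""
    return math.ceil(num_images / batch_size)

steps_per_epoch = steps_for(1024, 128)   # training: 1,024 images, batches of 128
validation_steps = steps_for(256, 32)    # validation: 256 images, batches of 32

print(steps_per_epoch, validation_steps)  # 8 8
```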

# Binary Crossentropy

https://gombru.github.io/2018/05/23/cross_entropy_loss/

#RMSprop

http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf

https://www.youtube.com/watch?v=eqEc66RFY0I&t=6s

# Binary Classification

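Logistic regression

1 = cat, 0 = non-cat

To turn the pixel values into a feature vector, list all of the pixel values in a single column of the input feature vector x

The pixel values are unrolled into one vector.

ex) Unrolling a 64 * 64 color image gives a vector of size 64 * 64 * 3 = 12,288

(x, y) -> x is the feature vector, y is the label

X = [x1 x2 x3 ..... xm] (m is the number of training samples)

X has shape (n_x, m)

Y has shape (1, m)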
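A quick NumPy sketch of that unrolling, using random arrays as stand-in images and made-up labels (both are assumptions for the example):

```python
import numpy as np

m = 5                                     # number of training samples
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(m, 64, 64, 3), dtype=np.uint8)
labels = np.array([1, 0, 0, 1, 1])        # 1 = cat, 0 = non-cat (made up)

# Unroll each image into one column, so X is n_x rows by m columns
X = images.reshape(m, -1).T / 255.0
Y = labels.reshape(1, m)
n_x = X.shape[0]

print(X.shape, Y.shape, n_x)              # (12288, 5) (1, 5) 12288
```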

- Horses or Humans Convnet

https://github.com/lmoroney/dlaicourse/blob/master/Course%201%20-%20Part%208%20-%20Lesson%202%20-%20Notebook.ipynb

- Horses or Humans with Validation

https://github.com/lmoroney/dlaicourse/blob/master/Course%201%20-%20Part%208%20-%20Lesson%203%20-%20Notebook.ipynb

- Horses or Humans With Compacting of Images

https://github.com/lmoroney/dlaicourse/blob/master/Course%201%20-%20Part%208%20-%20Lesson%204%20-%20Notebook.ipynb

#Image Augmentation

: helps avoid overfitting

- Limited data -> limited results

- Train on transformed versions of the images, such as rotations (ex. from one cat photo you can get both a standing cat and a lying-down cat)

https://github.com/keras-team/keras-preprocessing

https://keras.io/api/preprocessing/image/

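rescale - multiplies the pixel values by a factor when the image is loaded (e.g. 1./255 to normalize); it does not change the image size

rotation_range - rotates by a random angle between 0 and value (in degrees)

width(height)_shift_range
- most images have the subject in the middle
- shifting the image sideways or up and down by a random amount moves the subject off-center

shear_range - the amount of shear (slant)

zoom_range - can turn a full-body shot into an upper-body shot
- the zoom amount is picked at random, up to value

horizontal_flip - mirrors left and right (applied at random)

fill_mode - fills in the pixels lost by a transformation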
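Two of these transformations are simple enough to write by hand. The NumPy sketch below only illustrates the idea of a random flip and a shift with filling; it is not the ImageDataGenerator code, and `horizontal_flip`, `shift_right`, and `random_augment` are names invented for the example.

```python
import numpy as np

def horizontal_flip(image):
    """Mirror the image left to right."""
    return image[:, ::-1]

def shift_right(image, pixels, fill_mode="nearest"):
    """Shift right by `pixels` columns and fill the columns that open up."""
    if pixels <= 0:
        return image.copy()
    shifted = np.zeros_like(image)
    shifted[:, pixels:] = image[:, :-pixels]
    if fill_mode == "nearest":
        shifted[:, :pixels] = image[:, :1]   # repeat the nearest edge column
    return shifted                           # "constant" leaves the gap at 0

def random_augment(image, rng, max_shift=1):
    """Flip with probability 0.5, then shift by a random amount."""
    if rng.random() < 0.5:
        image = horizontal_flip(image)
    return shift_right(image, int(rng.integers(0, max_shift + 1)))

image = np.array([[1, 2, 3],
                  [4, 5, 6]])

flipped = horizontal_flip(image)             # [[3, 2, 1], [6, 5, 4]]
shifted = shift_right(image, 1)              # [[1, 1, 2], [4, 4, 5]]
augmented = random_augment(image, np.random.default_rng(0))

print(flipped, shifted, augmented, sep="\n")
```

Because the transformation is drawn fresh each time an image is loaded, the network rarely sees exactly the same pixels twice, even though the files on disk never change.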

- Image Augmentation is not a guaranteed way to avoid overfitting.

#Transfer Learning

https://www.tensorflow.org/tutorials/images/transfer_learning

lock(freeze):

You saw Transfer Learning, and how you can take an existing model, freeze many of its layers to prevent them being retrained, and effectively 'remember' the convolutions it was trained on to fit images.

Freeze the pretrained layers, then train on our own data (image augmentation can still be applied)

Every layer has a name.

last_output
(the output taken from the layer named mixed7)

<Drop out>

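- A technique that randomly removes units of the neural network

Neighboring units in a layer can end up with similar weights and influence each other, which can lead to overfitting

Why is it used as a regularizer?

: every iteration works on a smaller network

-> this seems to have a regularizing effect

A single unit -> has to take its inputs and produce a meaningful output

-> with drop out, its input units are removed at random

-> the unit cannot rely on any one feature -> because the inputs keep changing

-> it does not put a large weight on a particular input, and learns to spread the weights across its inputs

-> shrink weights

keep_prob (the probability of keeping a unit in a given layer) -> can be set differently for each layer

-> it can be set high in layers where overfitting is less of a concern

-> it is set low in layers where overfitting is a bigger concern

CV tends to overfit a lot, so drop out is used in most cases.

Downside) the cost function is no longer well defined

-> even when monitoring gradient descent, it is hard to check that J is going down

-> debugging becomes harder

-> first set keep_prob to 1 and check that J goes down, then turn drop out on

drop out
- set a value between 0 and 1
- that fraction of the units is dropped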
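A minimal NumPy sketch of the mechanism, assuming the common "inverted dropout" variant, where the surviving activations are divided by keep_prob so the expected value stays the same. Note the two conventions: keep_prob is the fraction kept, while the Keras Dropout layer takes the fraction dropped (rate = 1 - keep_prob).

```python
import numpy as np

def dropout(activations, keep_prob, rng):
    """Zero each unit with probability 1 - keep_prob and rescale the rest."""
    if keep_prob >= 1.0:
        return activations.copy()          # keep_prob = 1 turns dropout off
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
a = np.ones((1000, 100))                   # stand-in activations

dropped = dropout(a, keep_prob=0.8, rng=rng)
kept_fraction = float(np.mean(dropped != 0))
unchanged = dropout(a, keep_prob=1.0, rng=rng)

print(kept_fraction, dropped.mean())
```

Setting keep_prob to 1 returns the activations untouched, which is what makes the "check that J goes down first" debugging trick possible.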


[Figure: results without drop out (left) vs. with drop out (right)]

1. Instantiate a new model using the weights of the pretrained network

2. Pick one of the convolutional layers and take its output = last_output

3. Build the model with last_output as its input (Flatten, Dense, Dropout), then set up the image generators

#Multi Class classification

Rock-paper-scissors classification

http://www.laurencemoroney.com/rock-paper-scissors-dataset/

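Softmax outputs a probability for every class, so the result takes all of the classes into account.

The outputs are ordered alphabetically by class name (paper, rock, scissors)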
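Reading a prediction therefore means matching the position of the largest probability against the alphabetically sorted class names. A small sketch with made-up scores (the `logits` values are hypothetical, not model output):

```python
import numpy as np

# Class order follows the sorted subdirectory names
class_names = sorted(["rock", "paper", "scissors"])   # ['paper', 'rock', 'scissors']

def softmax(z):
    exps = np.exp(z - np.max(z))
    return exps / exps.sum()

logits = np.array([0.2, 2.5, -1.0])      # hypothetical raw scores for one image
probs = softmax(logits)                  # one probability per class, summing to 1
prediction = class_names[int(np.argmax(probs))]

print(dict(zip(class_names, probs.round(3))), prediction)
```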

Source: Coursera, TensorFlow in Practice Specialization
