
Saturday, May 28, 2022

2.1 Neuron Model

An adult brain contains about 100 billion neurons. Each neuron obtains input signals through dendrites and transmits output signals through axons. The neurons are interconnected to form a huge neural network, thus forming the human brain, the basis of perception and consciousness. Figure 2-1 shows a typical biological neuron structure. In 1943, the psychologist Warren McCulloch and mathematical logician Walter Pitts proposed a mathematical model of artificial neural networks to simulate the mechanism of biological neurons. This research was further developed by the American neurologist Frank Rosenblatt into the perceptron model, which is also the cornerstone of modern deep learning.

Starting from the structure of biological neurons, we will revisit the exploration of scientific pioneers and gradually unveil the mystery of automatic learning machines.

First, we can abstract the neuron model into the mathematical structure shown in Figure 2-2. The neuron takes an input vector x = [x1, x2, x3, ..., xn]^T and maps it to y through the function f: x -> y, where θ represents the parameters of the function f. Consider a simplified case, a linear transformation: f(x) = w^T x + b. The expanded form is

f(x) = w1x1 + w2x2 + ... + wnxn + b

The preceding calculation logic can be intuitively shown in Figure 2-2.

The parameters θ = {w1, w2, w3, ..., wn, b} determine the state of the neuron, and the processing logic of this neuron can be determined by fixing those parameters. When the number of input nodes n = 1 (single input), the neuron model can be further simplified as

y = wx + b

 Then we can plot the change of y as a function of x as shown in Figure 2-3. As the input signal x increases, the output also increases linearly. Here parameter w can be understood as the slope of the straight line, and b is the bias of the straight line.
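To visualize this single-input case yourself, the following short sketch (my own illustration, assuming numpy and matplotlib are installed; the values of w and b are arbitrary examples) plots y = wx + b over a range of inputs:

import numpy as np
import matplotlib.pyplot as plt

w, b = 2.0, 1.0              # arbitrary example slope and bias
x = np.linspace(-5, 5, 100)  # input signal x
y = w * x + b                # single-input linear neuron output
plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.show()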

For a certain neuron, the mapping relationship f between x and y is unknown but fixed. Two points determine a straight line. In order to estimate the values of w and b, we only need to sample any two data points (x(1), y(1)) and (x(2), y(2)) from the straight line in Figure 2-3, where the superscript indicates the data point number:

y(1) = wx(1) + b

y(2) = wx(2) + b

If (x(1), y(1)) ≠ (x(2), y(2)), we can solve the preceding equations to get the values of w and b. Let's consider a specific example: x(1) = 1, y(1) = 1.567, x(2) = 2, y(2) = 3.043. Substituting the numbers into the preceding formulas gives

1.567 = w · 1 + b

3.043 = w · 2 + b

This is a system of two linear equations in two unknowns, the kind we learned in junior or senior high school. The analytical solution can be easily calculated using the elimination method: w = 1.477, b = 0.089.
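The same solution can be reproduced numerically. Here is a minimal sketch (not the book's code, assuming numpy is available) that solves the two equations with np.linalg.solve:

import numpy as np

# Each row is [x, 1] so that A @ [w, b] equals y
A = np.array([[1.0, 1.0],
              [2.0, 1.0]])
y = np.array([1.567, 3.043])
w, b = np.linalg.solve(A, y)
print(w, b)  # matches the elimination result above up to rounding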

You can see that we only need two different data points to perfectly solve the parameters of a single-input linear neuron model. For a linear neuron model with N inputs, we only need to sample N + 1 different data points. It seems that linear neuron models can be perfectly resolved. So what's wrong with the preceding method? Considering that there may be observation errors at any sampling point, we assume that the observation error variable e follows a normal distribution N(μ, σ²) with mean μ and variance σ². Then the samples follow:

y = wx + b + e,  e ~ N(μ, σ²)

Once the observation error is introduced, even for a model as simple as this linear one, sampling only two data points may bring a large estimation bias. As shown in Figure 2-4, the data points all have observation errors. If the estimation is based on the two blue rectangular data points, the estimated blue dotted line would deviate considerably from the true orange straight line. In order to reduce the estimation bias introduced by observation errors, we can sample multiple data points D = {(x(1), y(1)), (x(2), y(2)), (x(3), y(3)), ..., (x(n), y(n))} and then find a "best" straight line that minimizes the sum of errors between all sampling points and the straight line.

Due to the existence of observation errors, there may not be a straight line that perfectly passes through all the sampling points in D. Therefore, we hope to find a "good" straight line close to all sampling points. How do we measure "good" and "bad"? A natural idea is to use the mean squared error (MSE) between the predicted value wx(i) + b and the true value y(i) at all sampling points as the total error, that is,

L = 1/n Σi (wx(i) + b - y(i))²

Then we search for a set of parameters w and b that minimizes the total error L. The straight line corresponding to the minimal total error is the optimal straight line we are looking for, that is,

w*, b* = argmin(w, b) 1/n Σi (wx(i) + b - y(i))²

Here n represents the number of sampling points.
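To make the idea concrete, here is a minimal sketch (my own illustration, not the book's code) that samples noisy points from the true line y = 1.477x + 0.089 discussed above and fits w and b by gradient descent on the MSE loss:

import numpy as np

np.random.seed(0)
n = 100
x = np.random.uniform(-10, 10, n)
e = np.random.normal(0.0, 0.1, n)   # observation error e ~ N(0, 0.1^2)
y = 1.477 * x + 0.089 + e           # noisy samples from the true line

w, b, lr = 0.0, 0.0, 0.01           # initial parameters and learning rate
for step in range(1000):
    y_pred = w * x + b
    # Gradients of L = mean((w*x + b - y)^2) with respect to w and b
    grad_w = 2.0 * np.mean((y_pred - y) * x)
    grad_b = 2.0 * np.mean(y_pred - y)
    w -= lr * grad_w
    b -= lr * grad_b
print(w, b)  # should be close to 1.477 and 0.089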


Tuesday, May 24, 2022

CHAPTER 2 Regression

Some people worry that artificial intelligence will make us feel inferior, but then, anybody in his right mind should have an inferiority complex every time he looks at a flower. -Alan Kay

1.6.4 Common Editor Installation

 There are many ways to write programs in Python. You can use IPython or Jupyter Notebook to write code interactively. You can also use Sublime Text, PyCharm, and VS Code to develop medium and large projects. This book recommends using PyCharm to write and debug code and using VSCode for interactive project development. Both of them are free. Users can download and install them by themselves.

Next, let's start the deep learning journey!

1.6.3 TensorFlow Installation

TensorFlow, like other Python libraries, can be installed using the Python package management tool with the "pip install" command. When installing TensorFlow, you need to decide whether to install the more powerful GPU version or the general-performance CPU version, based on whether your computer has an NVIDIA GPU graphics card.

# Install numpy

pip install numpy

With the preceding command, you should be able to automatically download and install the numpy library. Now let's install the latest GPU version of TensorFlow. The command is as follows:

# Install TensorFlow GPU version

pip install -U tensorflow

The preceding command should automatically download and install the TensorFlow GPU version, which is currently the official version of TensorFlow 2.x. The "-U" parameter specifies that if this package is already installed, it will be upgraded.

Now let's test whether the GPU version of TensorFlow is successfully installed. Enter "ipython" on the "cmd" command line to enter the ipython interactive terminal, and then enter the "import tensorflow as tf" command. If no errors occur, continue to enter "tf.test.is_gpu_available()" to test whether the GPU is available. This command will print a series of information. The information beginning with "I" (Information) contains details about the available GPU graphics devices, and the command returns "True" or "False" at the end, indicating whether the GPU device is available, as shown in Figure 1-35. If True, the TensorFlow GPU version is successfully installed; if False, the installation failed.

You may need to check the steps of CUDA, cuDNN, and environment variable configuration again or copy the error and seek help from the search engine.

If you don't have a GPU, you can install the CPU version. The CPU version cannot use the GPU to accelerate calculations, so the computational speed is relatively slow. However, because the models introduced for learning purposes in this book are generally not computationally expensive, the CPU version can also be used. It is also possible to add an NVIDIA GPU device after gaining a better understanding of deep learning in the future. If the installation of the TensorFlow GPU version fails, we can also use the CPU version directly. The command to install the CPU version is

# Install TensorFlow CPU version

pip install -U tensorflow-cpu

After installation, enter the "import tensorflow as tf" command in the ipython terminal to verify that the CPU version is successfully installed. After TensorFlow is installed, you can view the version number through "tf.__version__". Figure 1-36 shows an example. Note that the code works for all TensorFlow 2.x versions.
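For reference, here is a short verification snippet (a sketch assuming TensorFlow 2.x is already installed) that can be entered in the ipython terminal to check both the version number and GPU visibility:

import tensorflow as tf

print(tf.__version__)  # print the installed TensorFlow version, e.g., 2.x
# A non-empty list means TensorFlow can see at least one GPU device
print(tf.config.experimental.list_physical_devices('GPU'))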

The preceding manual process of installing CUDA and cuDNN, configuring the Path environment variable, and installing TensorFlow is the standard installation method. Although the steps are tedious, it is of great help for understanding the functional role of each library. In fact, a novice can complete the preceding steps with just two commands, as follows:

# Create virtual environment tf2 with tensorflow-gpu setup required

# to automatically install CUDA, cuDNN, and TensorFlow GPU

conda create -n tf2 tensorflow-gpu

#Activate tf2 environment

conda activate tf2

This quick installation method is called the minimal installation method. This is one of the conveniences of using the Anaconda distribution.

TensorFlow installed through the minimal installation method requires activation of the corresponding virtual environment before use, which distinguishes it from the standard installation. The standard installation goes into Anaconda's default environment, base, and generally does not require manually activating the base environment.

Common Python libraries can also be installed by default. The command is as follows:

# Install common python libraries

pip install -U ipython numpy matplotlib pillow pandas

When TensorFlow is running, it will consume all GPU resources by default, which is very unfriendly to other computations, especially when the computer has multiple users or programs using GPU resources at the same time. Occupying all GPU resources will make other programs unable to run. Therefore, it is generally recommended to set the GPU memory usage of TensorFlow to growth mode, that is, to request GPU memory based on the actual model size. The code implementation is as follows:

# Set GPU resource usage method

# Get GPU device list

gpus = tf.config.experimental.list_physical_devices('GPU')

if gpus:

    try:

        # Set GPU usage to growth mode

        for gpu in gpus:

            tf.config.experimental.set_memory_growth(gpu, True)

    except RuntimeError as e:

        # print error

        print(e)

Monday, May 23, 2022

1.6.2 CUDA Installation

Most current deep learning frameworks rely on NVIDIA GPU graphics cards for accelerated calculations, so you need to install the GPU acceleration library CUDA provided by NVIDIA. Before installing CUDA, make sure your computer has an NVIDIA graphics device that supports CUDA. If your computer does not have an NVIDIA graphics card (for example, some computers have AMD or Intel graphics cards), the CUDA program won't work, and you can skip this step and directly install the TensorFlow CPU version.

The installation of CUDA is divided into three steps: CUDA software installation, cuDNN deep neural network acceleration library installation, and environment variable configuration. The installation process is a bit tedious. We will go through them step by step using the Windows 10 system as an example.

CUDA Software Installation. Open the official download page of the CUDA program: https://developer.nvidia.com/cuda-10.0-download-archive. Here we use the CUDA 10.0 version: select the Windows platform, x86_64 architecture, Windows 10 system, and exe (local) installation package, and then select "Download" to download the CUDA installation software. After the download is complete, open the installer. As shown in Figure 1-25, select the "Custom" option and click the "NEXT" button to enter the installation program selection list shown in Figure 1-26. Here you can select the components that need to be installed and unselect those that do not. Under the "CUDA" category, unselect the "Visual Studio Integration" item. Under the "Driver components" category, compare the version numbers: if the "Current Version" is greater than the "New Version," uncheck "Display Driver"; if the "Current Version" is less than or equal to the "New Version," leave "Display Driver" checked, as shown in Figure 1-27. After the setup is complete, click "NEXT" and follow the instructions to install.

After the installation is complete, let's test whether the CUDA software is successfully installed. Open the "cmd" terminal and enter "nvcc -V" to print the current CUDA version information, as shown in Figure 1-28. If the command is not recognized, the installation has failed. We can find the "nvcc.exe" program in the CUDA installation path "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin", as shown in Figure 1-29.

cuDNN Neural Network Acceleration Library Installation. CUDA is not a dedicated GPU acceleration library for neural networks; it is designed for a variety of applications that require parallel computing. If you want to accelerate neural network applications, you need to additionally install the cuDNN library. It should be noted that the cuDNN library is not an executable program. You only need to download and decompress the cuDNN file and configure the Path environment variable.

Open the website https://developer.nvidia.com/cudnn and select "Download cuDNN." Due to NVIDIA's regulations, users need to log in or create a new account to continue downloading. After logging in, enter the cuDNN download page and check "I Agree To the Terms of the cuDNN Software License Agreement," and the cuDNN version download options will pop up. Select the cuDNN version that matches CUDA 10.0, and click the "cuDNN Library for Windows 10" link to download the cuDNN file, as shown in Figure 1-30. Note that cuDNN itself has a version number, and it also needs to match the CUDA version number.

After downloading the cuDNN file, unzip it and rename the folder "cuda" to "cudnn765". Then copy the "cudnn765" folder to the CUDA installation path "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0". A dialog box that requires administrator rights may pop up here. Select continue to paste.

Environment Variable Configuration. We have completed the installation of cuDNN, but in order for the system to be aware of the location of the cuDNN file, we need to configure the Path environment variable as follows. Open the file browser, right-click "My Computer," select "Properties," select "Advanced system settings," and select "Environment Variables," as shown in Figure 1-32. Select the "Path" environment variable in the "System variables" column and select "Edit," as shown in Figure 1-33. Select "New," enter the cuDNN installation path "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\cudnn765\bin", and use the "Move up" button to move this item to the top.

After the CUDA installation is complete, the environment variables should include "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin", "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\libnvvp", and "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\cudnn765\bin". The preceding paths may differ slightly according to the actual installation path, as shown in Figure 1-34. After confirmation, click "OK" to close all dialog boxes.


1.6.1 Anaconda Installation

The Python interpreter is the bridge that allows code written in Python to be executed by the CPU, and it is the core software of the Python language. Users can download the appropriate version of the interpreter (Python 3.7 is used here) from www.python.org/. After the installation is completed, you can call the python.exe program to execute source code files written in Python (.py files).

Here we choose to install Anaconda, software that integrates the Python interpreter, package management, virtual environments, and a series of other auxiliary functions. We can download Anaconda from www.anaconda.com/distribution/#download-section and select the latest Python version to download and install. As shown in Figure 1-22, check the "Add Anaconda to my PATH environment variable" option, so that you can call the Anaconda programs through the command line. As shown in Figure 1-23, the installer asks whether to also install the VS Code software. Select Skip. The entire installation process lasts about 5 minutes; the specific time depends on computer performance.

After the installation is complete, how can we verify that Anaconda was installed successfully? Press the Windows+R key combination to bring up the Run dialog box, enter "cmd," and press Enter to open the command-line program "cmd.exe" that comes with Windows. Or click the Start menu and enter "cmd" to find and open the "cmd.exe" program. Enter the "conda list" command to view the installed libraries in the Python environment. If it is a newly installed Python environment, the listed libraries are all libraries that come with Anaconda, as shown in Figure 1-24. If "conda list" prints a list of libraries normally, the Anaconda installation was successful; otherwise, the installation failed and you need to reinstall.

Saturday, May 21, 2022

1.6 Development Environment Installation

After learning about the convenience brought by deep learning frameworks, we are now ready to install the latest version of TensorFlow on the local desktop. TensorFlow supports a variety of common operating systems, such as Windows 10, Ubuntu 18.04, and macOS. It supports both the GPU version running on NVIDIA GPUs and the CPU version that uses only the CPU for calculations. We take the most common combination, the Windows 10 operating system, an NVIDIA GPU, and Python, as an example to introduce how to install the TensorFlow framework and other development software.

Generally speaking, the development environment installation is divided into four major steps: the Python interpreter Anaconda, the CUDA acceleration library, the TensorFlow framework, and commonly used editors.


1.5.3 Demo

 The core of deep learning is the design idea of algorithms, and deep learning frameworks are just our tools for implementing algorithms. In the following, we will demonstrate the three core functions of the TensorFlow deep learning framework to help us understand the role of frameworks in algorithm design.

a) Accelerated Calculation

The neural network is essentially composed of a large number of basic mathematical operations such as matrix multiplication and addition. One important function of TensorFlow is to use the GPU to conveniently implement parallel computing acceleration. In order to demonstrate the acceleration effect of the GPU, we can compare the mean running time of multiple matrix multiplications on the CPU and the GPU as follows.

We create two matrices A and B with shapes [1, n] and [n, 1], respectively. The size of the matrices can be adjusted using the parameter n. The code is as follows (the imports and an example value of n are included so the snippet runs on its own):

# Imports and matrix size (n = 10000 is an example value; adjust as needed)
import timeit
import tensorflow as tf

n = 10000

# Create two matrices running on CPU

with tf.device('/cpu:0'):

    cpu_a = tf.random.normal([1,n])

    cpu_b =tf.random.normal([n,1])

    print(cpu_a.device, cpu_b.device)

# Create two matrices running on GPU

with tf.device('/gpu:0'):

    gpu_a = tf.random.normal([1,n])

    gpu_b = tf.random.normal([n,1])

    print(gpu_a.device, gpu_b.device)

Let's implement the functions for the CPU and GPU operations and measure the computation time of the two functions through the timeit.timeit() function. It should be noted that some additional environment initialization work is generally required for the first calculation, so this time should not be counted. We remove this time through a warm-up run and then measure the calculation time as follows:

def cpu_run(): # CPU function

    with tf.device('/cpu:0'):

        c = tf.matmul(cpu_a, cpu_b)

    return c

def gpu_run(): # GPU function

    with tf.device('/gpu:0'):

        c = tf.matmul(gpu_a, gpu_b)

    return c

# First calculation needs warm-up

cpu_time = timeit.timeit(cpu_run, number=10)

gpu_time = timeit.timeit(gpu_run, number=10)

print('warmup:', cpu_time, gpu_time)

# Calculate and print mean running time

cpu_time = timeit.timeit(cpu_run, number=10)

gpu_time = timeit.timeit(gpu_run, number=10)

print('run time:', cpu_time, gpu_time)

We plot the computation time under CPU and GPU environments for different matrix sizes, as shown in Figure 1-21. It can be seen that when the matrix size is small, the CPU and GPU times are almost the same, which does not reflect the advantages of GPU parallel computing. When the matrix size is larger, the CPU computing time increases significantly, while the GPU takes full advantage of parallel computing and its computation time hardly changes.

b) Automatic Gradient Calculation

When using TensorFlow to construct the forward calculation process, in addition to obtaining numerical results, TensorFlow also automatically builds a computational graph. TensorFlow provides automatic differentiation that can calculate the derivative of the output with respect to network parameters without manual derivation. Consider the following function:

y = aw² + bw + c

The derivative of the output y with respect to the variable w is

dy/dw = 2aw + b

Consider the derivative at (a, b, c, w) = (1, 2, 3, 4). We can get dy/dw = 2 * 1 * 4 + 2 = 10.

With TensorFlow, we can directly calculate the derivative given the expression of a function, without manually deriving the expression of the derivative. TensorFlow can derive it automatically. The code is implemented as follows:

import tensorflow as tf

# Create 4 tensors

a = tf.constant(1.)

b = tf.constant(2.)

c = tf.constant(3.)

w = tf.constant(4.)

with tf.GradientTape() as tape: # Track derivatives
    tape.watch([w]) # Add w to the derivative watch list
    # Design the function
    y = a * w**2 + b * w + c

# Auto derivative calculation
[dy_dw] = tape.gradient(y, [w])

print(dy_dw) # print the derivative

The result of the program is 

tf.Tensor(10.0, shape=(), dtype=float32)

It can be seen that the result of TensorFlow's automatic differentiation is consistent with the result of the manual calculation.

c) Common Neural Network Interface

In addition to the underlying mathematical functions such as matrix multiplication and addition, TensorFlow also provides a series of convenient functions for deep learning systems, such as commonly used neural network operation functions, commonly used network layers, network training, and model saving, loading, and deployment. Using TensorFlow, you can easily use these functions to complete common production processes, which is efficient and stable.
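As a brief illustration of these high-level functions, here is a minimal sketch (not a production workflow; the layer sizes, toy data, and training settings are arbitrary examples) that builds, trains, and saves a small classifier with the tf.keras interface:

import numpy as np
import tensorflow as tf

# Toy random data: 100 samples with 4 features and 3 classes (arbitrary example)
x = np.random.randn(100, 4).astype('float32')
y = np.random.randint(0, 3, size=(100,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(3, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x, y, epochs=3, verbose=0)  # train the network
model.save('model.h5')                # save the trained model to disk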


1.5.2 TensorFlow 2 and 1.x

TensorFlow 2 is a completely different framework from TensorFlow 1.x in terms of user experience. TensorFlow 2 is not compatible with TensorFlow 1.x code. At the same time, it is very different in programming style and functional interface design. TensorFlow 1.x code needs to be migrated manually, and automated migration methods are not reliable. Google is about to stop updating TensorFlow 1.x, so it is not recommended to learn TensorFlow 1.x now.

TensorFlow 2 supports the dynamic graph priority mode. You can obtain both the computational graph and the numerical results during the calculation. You can debug the code and print the data in real  time. The network is built like a building block, stacked layer by layer, which is in line with software development thinking.

Taking simple addition 2.0 + 4.0 as an example, in TensorFlow 1.x, we need to create a calculation graph first as follows:

import tensorflow as tf

# 1. Create computation graph with tf 1.x

# Create 2 input variables with fixed name and type

a_ph = tf.placeholder(tf.float32, name='variable_a')

b_ph = tf.placeholder(tf.float32, name='variable_b')

# Create output operation and name

c_op = tf.add(a_ph, b_ph, name='variable_c')

The process of creating a computational graph is analogous to establishing the formula c = a + b through symbols. It only records the computational steps of the formula and does not actually calculate the numerical results. The numerical results can only be obtained by running the output c and assigning the values a = 2.0 and b = 4.0 as follows:

# 2. Run computational graph with tf 1.x

# Create running environment

sess = tf.InteractiveSession()

#Initialization

init = tf.global_variables_initializer()

sess.run(init) # Run the initialization

# Run the computation graph and return value to c_numpy

c_numpy = sess.run(c_op, feed_dict={a_ph: 2., b_ph: 4.})

#print out the output

print('a+b', c_numpy)

It can be seen that it is tedious to perform even a simple addition operation in TensorFlow 1.x, let alone create complex neural network algorithms. This programming method of first creating a computational graph and then running it later is called symbolic programming.

Next, we use TensorFlow 2 to complete the same operation as follows:

import tensorflow as tf

# Use TensorFlow 2 to run

# 1. Create and initialize variables

a = tf.constant(2.)

b = tf.constant(4.)

# 2. Run and get result directly

c = a + b

print('a+b=', c)

As you can see, the calculation process is very simple, and there are no extra calculation steps.

The method of obtaining both the computation graph and the numerical results at the same time is called imperative programming, also known as dynamic graph mode. TensorFlow 2 and PyTorch are both developed with dynamic graph priority mode, which is easy to debug. In general, the dynamic graph mode is highly efficient for development, but it may not be as efficient as the static graph mode at run time. TensorFlow 2 also supports converting the dynamic graph mode to the static graph mode through tf.function, achieving a win-win of both development and running efficiency. In the remaining part of this book, we use TensorFlow to represent TensorFlow 2 in general.
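As a small illustration of this conversion (a sketch, not the book's example), decorating a Python function with tf.function traces it into a static graph while the calling code stays in the ordinary imperative style:

import tensorflow as tf

@tf.function
def add(a, b):
    # Executed as a compiled static graph after the first trace
    return a + b

c = add(tf.constant(2.), tf.constant(4.))
print(c)  # tf.Tensor(6.0, shape=(), dtype=float32)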


1.5.1 Major Frameworks

- Theano is one of the earliest deep learning frameworks. It was developed by Yoshua Bengio and Ian Goodfellow. It is a Python-based computing library for positioning low-level operations. Theano supports both GPU and CPU operations. Due to Theano's low development efficiency, long model compilation time, and developers switching to TensorFlow, Theano has now stopped maintenance.


- Scikit-learn is a complete computing library for machine learning algorithms. It has built-in support for common traditional machine learning algorithms, and it has rich documentation and examples. However, scikit-learn is not specifically designed for neural networks. It does not support GPU acceleration, and the implementation of neural network-related layers is also lacking.


- Caffe was developed by Jia Yangqing in 2013. It is mainly used for applications using convolutional neural networks and is not suitable for other types of neural networks. Caffe's main development language is C++, and it also provides interfaces for other languages such as Python. It supports both GPU and CPU. Due to its earlier development time and higher visibility in the industry, Facebook launched an upgraded version of Caffe, Caffe2, in 2017. Caffe2 has now been integrated into the PyTorch library.

- Torch is a very good scientific computing library, developed based on the less popular programming language Lua. Torch is highly flexible, and this flexibility is an excellent trait inherited by PyTorch. However, due to the small number of Lua language users, Torch never achieved mainstream adoption.

- MXNet was developed by Chen Tianqi and Li Mu and is the official deep learning framework of Amazon. It adopts a mix of imperative programming and symbolic programming, which gives it high flexibility, fast running speed, and rich documentation and examples.

- PyTorch is a deep learning framework launched by Facebook based on the original Torch framework, using Python as the main development language. PyTorch borrowed the design style of Chainer and adopted imperative programming, which makes it very convenient to build and debug networks. Although PyTorch was only released in 2017, its refined and compact interface design has won it wide popularity. After the 1.0 version, the original PyTorch and Caffe2 were merged to make up for PyTorch's deficiencies in industrial deployment. Overall, PyTorch is an excellent deep learning framework.

- Keras is a high-level framework implemented on top of the underlying operations provided by frameworks such as Theano and TensorFlow. It provides a large number of high-level interfaces for rapid training and testing. For common applications, developing with Keras is very efficient. But because it has no low-level implementation of its own and the underlying framework needs to be abstracted, its running efficiency is not high and its flexibility is average.

- TensorFlow is a deep learning framework released by Google in 2015. The initial version only supported symbolic programming. Thanks to its earlier release and Google's influence in the field of deep learning, TensorFlow quickly became the most popular deep learning framework. However, due to frequent changes in its interface design, redundant functional design, and the difficulty of developing and debugging with symbolic programming, TensorFlow 1.x was once criticized by the industry. In 2019, Google launched the official version of TensorFlow 2, which runs in dynamic graph priority mode and avoids many of the defects of TensorFlow 1.x. TensorFlow 2 has been widely recognized by the industry.


At present, TensorFlow and PyTorch are the two most widely used deep learning frameworks in industry. TensorFlow has a complete solution and user base in the industry. Thanks to its streamlined and flexible interface design, PyTorch can quickly build and debug networks, and it has received rave reviews in academia. After TensorFlow 2 was released, it became easier for users to learn TensorFlow and seamlessly deploy models to production. This book uses TensorFlow 2 as the main framework to implement deep learning algorithms.

Here are the connections and differences between TensorFlow and Keras. Keras can be understood as a set of high-level API design specifications. Keras itself has an official implementation of the specifications. The same specifications are also implemented in TensorFlow as the tf.keras module, and tf.keras will be used as the only high-level interface to avoid interface redundancy. Unless otherwise specified, Keras in this book refers to tf.keras.





1.4 DEEP LEARNING APPLICATIONS

As introduced earlier, there is an abundance of scenarios and applications where Deep Learning is being used. Let us look at a few applications of Deep Learning for a more profound understanding of where exactly DL is applied.

1.3 WHAT IS THE NEED OF A TRANSITION FROM MACHINE LEARNING TO DEEP LEARNING?

Machine Learning has been around for a very long time. Machine Learning helped and motivated scientists and researchers to come up with newer algorithms to meet the expectations of technology enthusiasts. The major limitation of Machine Learning lies in the explicit human intervention needed to extract features from the data we work with (Figure 1.1). Deep Learning allows for automated feature extraction and learning of the model, adapting all by itself to the dynamism of data.

Apple => Manual feature extraction => Learning => Machine learning => Apple

Limitation of Machine Learning.

Apple => Automatic feature extraction and learning => Deep learning => Apple

Advantages of Deep Learning.

Deep Learning very closely tries to imitate the structure and pattern of biological neurons. This single concept, which makes it more complex, still helps to come out with effective predictions. Human intelligence is supposed to be the best of all types of intelligence in the universe. Researchers are still striving to understand the complexity of how the human brain works. The Deep Learning module acts like a black box, which takes inputs, does the processing in the black box, and gives the desired output. It helps us, with the help of GPUs and TPUs, to work with complex algorithms at a faster pace. The model developed could be reused for similar futuristic applications.



1.2 THE NEED: WHY DEEP LEARNING?

Deep Learning applications have become an indispensable part of contemporary life. Whether we acknowledge it or not, there is no single day in which we do not use virtual assistants like Google Home, Alexa, Siri, and Cortana at home. We commonly see our parents use Google Voice Search to get search results easily without the effort of typing. Shopaholics cannot imagine shopping online without the appropriate recommendations scrolling in. We never perceive how intensely Deep Learning has invaded our normal lifestyles. We already have automatic cars in the market, like the MG Hector, which can perform according to our communication. We already have the luxury of smartphones, smart homes, smart electrical appliances, and so forth. We are invariably taken to a new status of lifestyle and comfort by the technological advancements that happen in the field of Deep Learning.

1.1 INTRODUCTION

Artificial Intelligence and Machine Learning, which make machines artificially intelligent, have been buzzwords for more than a decade now. Computational speed and enormous amounts of data have stimulated academics to dive deep and unleash the tremendous research potential that lies within. Even though Machine Learning helped us start learning intricate and robust systems, Deep Learning has curiously entered as a subset of AI, producing incredible results and outputs in the field.

Deep Learning architecture is built very similarly to the working of a human brain, whereby scientists teach the machine to learn in the way that humans learn. This is definitely a tedious and challenging task, as the working of the human brain itself is a complex phenomenon. Research in the field has produced valuable outcomes that make things easily understandable for scholars and scientists, helping them build worthy applications for the welfare of society. They have made the various layers in Deep Learning neural nets auto-adapt and learn according to the volume of datasets and the complexity of algorithms.

The efficacy of Deep Learning algorithms is in no way comparable to traditional Machine Learning. Deep Learning has helped industrialists deal with unsolved problems in a convincing way, opening a wide horizon with ample opportunity. Natural language processing, speech and image recognition, the entertainment sector, online retailing sectors, banking and finance sectors, the automotive industry, chatbots, recommender systems, and voice assistants to self-driving cars are some of the major advancements in the field of Deep Learning.

CHAPTER 1. Introduction to Deep Learning

 LEARNING OBJECTIVES

After reading through this chapter, the reader will understand the following:

- The need for Deep Learning

- What is the need of transition from Machine Learning to Deep Learning?

- The tools and languages available for Deep Learning

- Further reading


Tuesday, May 17, 2022

1.5 Deep Learning Framework

If a workman wants to do his work well, he must first sharpen his tools. After learning the basic knowledge of deep learning, let's pick the tools used to implement deep learning algorithms.

1.4.3 Reinforcement Learning

Virtual Games. Compared with the real environment, virtual game platforms can both train and test reinforcement learning algorithms and can avoid interference from irrelevant factors while also minimizing the cost of experiments. Currently, commonly used virtual game platforms include OpenAI Gym, OpenAI Universe, OpenAI Roboschool, DeepMind OpenSpiel, and MuJoCo, and commonly used reinforcement learning algorithms include DQN, A3C, A2C, and PPO. In the field of Go, the DeepMind AlphaGo program has surpassed human Go experts. In Dota 2 and StarCraft games, the intelligent programs developed by OpenAI and DeepMind have also defeated professional teams under restricted rules.

Robotics. In the real environment, the control of robots has also made some progress. For example, the UC Berkeley lab has made a lot of progress in the areas of imitation learning, meta learning, and few-shot learning in robotics. Boston Dynamics has made gratifying achievements in robot applications. The robots it manufactures perform well on tasks such as complex terrain walking and multi-agent collaboration (Figure 1-19).

Autonomous driving is considered as an application direction of reinforcement learning in the short term. Many companies have invested a lot of resources in autonomous driving, such as Baidu, Uber, and Google. Apollo from Baidu has begun trial operations in Beijing, Xiong'an, Wuhan, and other places. Figure 1-20 shows Baidu's self-driving car Apollo.

1.4.2 Natural Language Processing

Machine Translation. In the past, machine translation algorithms were usually based on statistical machine translation models, which was also the technology used by Google's translation system before 2016. In November 2016, Google launched the Google Neural Machine Translation (GNMT) system based on the Seq2Seq model, realizing direct translation from source language to target language for the first time, with a 50~90% improvement on multiple tasks. Commonly used machine translation models are Seq2Seq, BERT, GPT, and GPT-2. Among them, the GPT-2 model proposed by OpenAI has about 1.5 billion parameters. At the beginning, OpenAI refused to open-source the GPT-2 model for technical security reasons.

Chatbots are also a mainstream task of natural language processing. Machines automatically learn to talk to humans, provide satisfactory automatic responses to simple human requests, and improve customer service efficiency and quality. Chatbots are often used in consulting systems, entertainment systems, and smart homes.

1.4.1 Computer Vision

Image classification is a common classification problem. The input to the neural network is a picture, and the output is the probability that the current sample belongs to each category. Generally, the category with the highest probability is selected as the predicted category of the sample.
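For example (a minimal sketch with made-up numbers), picking the predicted category from a network's output probabilities amounts to an argmax over the class dimension:

import numpy as np

# Hypothetical output probabilities for 4 categories
probs = np.array([0.05, 0.10, 0.70, 0.15])
pred_class = int(np.argmax(probs))  # index of the most likely category
print(pred_class)  # 2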

Image recognition is one of the earliest successful applications of deep learning. Classic neural network models include VGG series, Inception series, and ResNet series.

Object detection refers to the automatic detection by an algorithm of the approximate location of common objects in a picture. The location is usually represented by a bounding box, and the category of the object in the bounding box is classified, as shown in Figure 1-15. Common object detection algorithms are RCNN, Fast RCNN, Faster RCNN, Mask RCNN, SSD, and the YOLO series.

Semantic segmentation is an algorithm to automatically segment and identify the content in a picture. We can understand semantic segmentation as the classification of each pixel, analyzing the category information of each pixel, as shown in Figure 1-16. Common semantic segmentation models include FCN, U-Net, SegNet, and the DeepLab series.

Video Understanding. As deep learning achieves better results on 2D picture-related tasks, 3D video understanding tasks with temporal dimension information (the third dimension is the sequence of frames) are receiving more and more attention. Common video understanding tasks include video classification, behavior detection, and video subject extraction. Common models are C3D, TSN, DOVF, and TS_LSTM.

Image generation learns the distribution of real pictures and samples from the learned distribution to obtain highly realistic generated pictures. At present, common image generation models include the VAE series and the GAN series. Among them, the GAN series of algorithms has made great progress in recent years. The pictures produced by the latest GAN models have reached a level where it is difficult to distinguish authenticity with the naked eye, as shown in Figure 1-17.

In addition to the preceding applications, deep learning has also achieved significant results in other areas, such as artistic style transfer (Figure 1-18), super-resolution, picture de-noising/de-hazing, grayscale picture coloring, and many others.


1.4 Deep Learning Applications

Deep learning algorithms have been widely used in our daily life, such as voice assistants in mobile phones, intelligent assisted driving in cars, and face payments. We will introduce some mainstream applications of deep learning, starting with computer vision, natural language processing, and reinforcement learning.

1.3.4 General Intelligence

In the past, in order to improve the performance of an algorithm on a certain task, it was often necessary to use prior knowledge to manually design corresponding features to help the algorithm better converge to the optimal solution. This type of feature extraction method is often strongly tied to the specific task. Once the scenario changes, these artificially designed features and prior settings cannot adapt to the new scenario, and people often need to redesign the algorithms.

Designing a universal intelligence mechanism that can automatically learn and self-adjust like the human brain has always been a common vision of human beings. Deep learning is one of the algorithms closest to general intelligence. In the computer vision field, previous methods that needed to design features for specific tasks and add prior assumptions have been abandoned by deep learning algorithms. At present, almost all algorithms in image recognition, object detection, and semantic segmentation are based on end-to-end deep learning models, which show good performance and strong adaptability. On the Atari game platform, the DQN algorithm designed by DeepMind can reach a level equivalent to humans in 49 games under the same algorithm, model structure, and hyperparameter settings, showing a certain degree of general intelligence. Figure 1-14 shows the network structure of the DQN algorithm. It is not designed for a certain game but can play 49 games on the Atari game platform.

Monday, May 16, 2022

1.3.3 Network Scale

Early perceptron models and multilayer neural networks only had one or two to four layers, and the number of network parameters was also around tens of thousands. With the development of deep learning and the improvement of computing capabilities, models such as AlexNet (8 layers), VGG16 (16 layers), GoogLeNet (22 layers), ResNet50 (50 layers), and DenseNet121 (121 layers) have been proposed successively, while the size of input pictures has also gradually increased from 28 x 28 to 224 x 224 to 299 x 299 and even larger. These changes make the total number of network parameters reach the level of tens of millions, as shown in Figure 1-13.

The increase of network scale enhances the capacity of the neural networks correspondingly, so that the networks can learn more complex data modalities and the model performance can be improved accordingly. On the other hand, the increase of the network scale also means that we need more training data and computational power to avoid overfitting.

1.3.2 Computing Power

The increase in computing power is an important factor in the third artificial intelligence renaissance. In fact, the basic theory of modern deep learning was proposed in the 1980s, but the real potential of deep learning was not realized until the release of AlexNet, trained on two GTX 580 GPUs, in 2012. Traditional machine learning algorithms do not have stringent requirements on data volume and computing power the way deep learning does; usually, serial training on a CPU can get satisfactory results. But deep learning relies heavily on parallel acceleration computing devices. Most current neural networks use parallel acceleration chips such as NVIDIA GPUs and Google TPUs to train model parameters. For example, the AlphaGo Zero program needed to be trained on 64 GPUs from scratch for 40 days before surpassing all previous AlphaGo versions. An automatic network structure search algorithm used 800 GPUs to find a better network structure.

At present, the deep learning acceleration hardware devices that ordinary consumers can use are mainly NVIDIA GPUs. Comparing the floating-point computing capacity of x86 CPUs and NVIDIA GPUs from 2008 to 2017, it can be seen that the x86 CPU curve changes relatively slowly, while the floating-point computing capacity of NVIDIA GPUs grows exponentially, mainly driven by the growing business of gaming and deep learning computing.

1.3.1 Data Volume

Early machine learning algorithms are relatively simple and fast to train, and the size of the required dataset is relatively small, such as the Iris flower dataset collected by the British statistician Ronald Fisher in 1936, which contains only three categories of flowers, with each category having 50 samples. With the development of computer technology, the designed algorithms became more and more complex, and the demand for data volume also increased. The MNIST handwritten digit picture dataset collected by Yann LeCun in 1998 contains a total of ten categories of digits from 0 to 9, with up to 7,000 pictures in each category. With the rise of neural networks, especially deep learning networks, the number of network layers is generally large, and the number of model parameters can reach one million, ten million, or even one billion. To prevent overfitting, the size of the training dataset is usually huge. The popularity of modern social media also makes it possible to collect huge amounts of data. For example, the ImageNet dataset released in 2010 included a total of 14,197,122 pictures, and the compressed file size of the entire dataset was 154 GB. Figures 1-10 and 1-11 list the number of samples and the size of datasets over time.

Although deep learning has a high demand for large datasets, collecting data, especially labeled data, is often very expensive. The formation of a dataset usually requires manual collection and crawling of raw data, cleaning out invalid samples, and then annotating the data samples with human intelligence, so subjective bias and random errors are inevitably introduced. Therefore, algorithms with small data volume requirements are a very hot topic.


1.3 Deep Learning Characteristics

 Compared with traditional machine learning algorithms and shallow neural networks, modern deep learning algorithms usually have the following characteristics.

1.2.2 Deep Learning

In 2006, Geoffrey Hinton et al. found that multilayer neural networks can be better trained through layer-by-layer pre-training and achieved a better error rate than SVM on the MNIST handwritten digit picture dataset, ushering in the third artificial intelligence revival. In that paper, Geoffrey Hinton first proposed the concept of deep learning. In 2011, Xavier Glorot proposed the Rectified Linear Unit (ReLU) activation function, which is one of the most widely used activation functions now. In 2012, Alex Krizhevsky proposed an eight-layer deep neural network, AlexNet, which used the ReLU activation function and Dropout technology to prevent overfitting. At the same time, it abandoned the layer-by-layer pre-training method and directly trained the network on two NVIDIA GTX 580 GPUs. AlexNet won first place in the ILSVRC-2012 picture recognition competition, showing a stunning 10.9% reduction in the top-5 error rate compared with the second place.

Since the AlexNet model was developed, various models have been published successively, including the VGG series, GoogLeNet series, ResNet series, and DenseNet series. The ResNet series increases the number of layers of the network to hundreds or even thousands while maintaining the same or even better performance, and it is one of the most representative models of deep learning.

In addition to the amazing results in supervised learning, huge achievements have also been made in unsupervised learning and reinforcement learning. In 2014, Ian Goodfellow proposed generative adversarial networks (GANs), which learn the true distribution of samples through adversarial training in order to generate samples with higher fidelity. Since then, a large number of GAN models have been proposed. The latest image generation models can produce images whose authenticity is hard to discern with the naked eye. In 2016, DeepMind applied deep neural networks to the field of reinforcement learning and proposed the DQN algorithm, which achieved a level comparable to or even higher than that of humans in 49 games on the Atari game platform. In the field of Go, the AlphaGo and AlphaGo Zero intelligent programs from DeepMind have successively defeated top human Go players Lee Sedol, Ke Jie, and others. In the multi-agent collaboration game Dota 2, the OpenAI Five intelligent program developed by OpenAI defeated the TI8 champion team OG in a restricted game environment, showing a large number of professional high-level intelligent operations. Figure 1-9 lists the major time points of AI development between 2006 and 2019.

Sunday, May 15, 2022

1.2.1 Shallow Neural Networks

In 1943, psychologist Warren McCulloch and logician Walter Pitts proposed the earliest mathematical model of neurons based on the structure of biological neurons, called the MP neuron model after their last name initials. In this model f(x) = h(g(x)), where g(x) = Σi xi and xi ∈ {0, 1}, the output is predicted from the value of g(x), as shown in Figure 1-4: if g(x) >= 0, the output is 1; if g(x) < 0, the output is 0. The MP neuron model has no learning ability and can only complete fixed logic judgments.

In 1958, American psychologist Frank Rosenblatt proposed the first neuron model that can automatically learn weights, called the perceptron. As shown in Figure 1-5, the error between the output value o and the true value y is used to adjust the weights of the neuron {w1, w2, w3, ..., wn}. Frank Rosenblatt then implemented the perceptron model on the "Mark 1 perceptron" hardware. As shown in Figures 1-6 and 1-7, the input is an image sensor with 400 pixels, and the output has eight nodes. It can successfully identify some English letters. It is generally believed that 1943-1969 was the first prosperous period of artificial intelligence development.

In 1969, the American scientist Marvin Minsky and others pointed out the main flaw of linear models such as perceptrons in the book Perceptrons. They found that perceptrons cannot handle simple linearly inseparable problems such as XOR. This directly led to the trough period of perceptron-related neural network research. It is generally considered that 1969-1982 was the first winter of artificial intelligence.

Although it was the trough period of AI, many significant studies were still published one after another. The most important one is the backpropagation (BP) algorithm, which is still the core foundation of modern deep learning algorithms. In fact, the mathematical idea of the BP algorithm had been derived as early as the 1960s, but it had not been applied to neural networks at that time. In 1974, American scientist Paul Werbos first proposed in his doctoral dissertation that the BP algorithm could be applied to neural networks. Unfortunately, this result did not receive enough attention. In 1986, David Rumelhart et al. published a paper in Nature using the BP algorithm for feature learning. Since then, the BP algorithm started gaining widespread attention.

In 1982, with the introduction of John Hopfield's cyclically connected Hopfield network, the second wave of the artificial intelligence renaissance started, lasting from 1982 to 1995. During this period, convolutional neural networks, recurrent neural networks, and the backpropagation algorithm were developed one after another. In 1986, David Rumelhart, Geoffrey Hinton, and others applied the BP algorithm to multilayer perceptrons. In 1989, Yann LeCun and others applied the BP algorithm to handwritten digit image recognition and achieved great success, which is known as LeNet. The LeNet system was successfully commercialized in zip code recognition, bank check recognition, and many other systems. In 1997, one of the most widely used recurrent neural network variants, Long Short-Term Memory (LSTM), was proposed by Jürgen Schmidhuber. In the same year, the bidirectional recurrent neural network was also proposed.

Unfortunately, the study of neural networks gradually entered a trough with the rise of traditional machine learning algorithms represented by support vector machines (SVMs), which is known as the second winter of artificial intelligence. Support vector machines have a rigorous theoretical foundation, require a small number of training samples, and also have good generalization capabilities. In contrast, neural networks lack a theoretical foundation and are hard to interpret; deep networks were difficult to train, and their performance was ordinary. Figure 1-8 shows the significant time points of AI development between 1943 and 2006.



Saturday, May 14, 2022

1.2 The History of Neural Networks

We divide the development of neural networks into the shallow neural network stage and the deep learning stage, with 2006 as the dividing point. Before 2006, deep learning developed under the name of neural networks and experienced two ups and two downs. In 2006, Geoffrey Hinton first named deep neural networks "deep learning," which started its third revival.


1.1.3 Neural Networks and Deep Learning

Neural network algorithms are a class of algorithms that learn from data based on neural networks. They still belong to the category of machine learning. Due to the limitations of computing power and data volume, early neural networks were shallow, usually with around one to four layers. Therefore, their expressive power was limited. With the improvement of computing power and the arrival of the big data era, highly parallelized graphics processing units (GPUs) and massive data make training large-scale neural networks possible.

In 2006, Geoffrey Hinton first proposed the concept of deep learning. In 2012, AlexNet, an eight-layer deep neural network, was released and achieved huge performance improvements in the image recognition competition. Since then, neural network models with dozens, hundreds, and even thousands of layers have been developed successively, showing strong learning ability. Algorithms implemented using deep neural networks are generally referred to as deep learning models. In essence, neural networks and deep learning can be considered the same.

Let's simply compare deep learning with other algorithms. As shown in Figure 1-3, rule-based systems usually encode explicit logic, which is generally designed for specific tasks and is not suitable for other tasks. Traditional machine learning algorithms artificially design feature detection methods with certain generality, such as SIFT and HOG features. These features are suitable for a certain type of task and have certain generality, but the performance highly depends on how those features are designed. The emergence of neural networks has made it possible for computers to design those features automatically, without human intervention. Shallow neural networks typically have limited feature extraction capability, while deep neural networks are capable of extracting high-level, abstract features and have better performance.

1.1.2 Machine Learning

Machine learning can be divided into supervised learning, unsupervised learning, and reinforcement learning, as shown in Figure 1-2.

Supervised Learning. The supervised learning dataset contains samples x and sample labels y. The algorithm needs to learn the mapping relationship fθ: x -> y, where fθ represents the model function and θ the parameters of the model. During training, the model parameters θ are optimized by minimizing the errors between the model prediction fθ(x) and the real value y, so that the model can make more accurate predictions. Common supervised learning models include linear regression, logistic regression, support vector machines (SVMs), and random forests.

Unsupervised Learning. Collecting labeled data is often more expensive. For a sample-only dataset, the algorithm needs to discover the modalities of the data itself. This kind of algorithm is called unsupervised learning. One type of unsupervised learning algorithm uses the sample itself as the supervision signal, that is, f: x -> x, which is known as self-supervised learning. During training, the parameters are optimized by minimizing the error between the model's predicted value f(x) and the sample x itself. Common unsupervised learning algorithms include autoencoders and generative adversarial networks (GANs).

Reinforcement Learning. This is a type of algorithm that learns strategies for solving problems by interacting with the environment. Unlike supervised and unsupervised learning, reinforcement learning problems do not have a clear "correct" action supervision signal. The algorithm needs to interact with the environment to obtain a delayed reward signal from the environmental feedback. Therefore, it is not possible to directly optimize the network by calculating the errors between the model prediction and the "correct values." Common reinforcement learning algorithms are Deep Q-Networks (DQNs) and Proximal Policy Optimization (PPO).


1.1.1 Artificial Intelligence Explained

 AI is a technology that allows machines to acquire intelligence and inference mechanisms like humans. This concept first appeared at the Dartmouth Conference in 1956. This is a very challenging task. At present, human beings do not yet have a comprehensive and scientific understanding of the working mechanism of the human brain, and it is undoubtedly more difficult to make intelligent machines that can reach the level of the human brain. With that being said, machines that achieve intelligence similar to, or even surpassing, human intelligence in some ways have been proven to be feasible.

How to realize AI is a very broad question. The development of AI has mainly gone through three stages, and each stage represents the exploration footprint of humans trying to realize AI from different angles. In the early stage, people tried to develop intelligent systems by summarizing and generalizing some logical rules and implementing them in the form of computer programs. But such explicit rules are often too simple and are difficult to use to express complex and abstract concepts and rules. This stage is called the inference period.

In the 1970s, scientists tried to implement AI through knowledge databases and reasoning. They built large and complex expert systems to simulate the intelligence level of human experts. One of the biggest difficulties with these explicitly specified rules is that many complex, abstract concepts cannot be implemented in concrete code. For example, the process of human recognition of pictures and understanding of languages cannot be simulated by established rules at all. To solve such problems, a research discipline that allowed machines to automatically learn rules from data, known as machine learning, was born. Machine learning became a popular subject in AI in the 1980s. This is the second stage.

In machine learning, there is a direction to learn complex, abstract logic through neural networks. Research on the direction of neural networks has experienced two ups and downs. Since 2012, the applications of deep neural network technology have made major breakthroughs in fields like computer vision, natural language processing (NLP), and robotics. Some tasks have even surpassed the level of human intelligence. This is the third revival of AI. Deep neural networks eventually got a new name - deep learning. Generally speaking, the essential difference between neural networks and deep learning is not large. Deep learning refers to models or algorithms based on deep neural networks. The relationship between artificial intelligence, machine learning, neural networks, and deep learning is shown in Figure 1-1.



1.1 Artificial Intelligence in Action

 Information technology is the third industrial revolution in human history. The popularity of computers, the Internet, and smart home technology has greatly facilitated people's daily lives. Through programming, humans can hand over the interaction logic designed in advance to the machine to execute repeatedly and quickly, thereby freeing humans from simple and tedious repetitive labor. However, for tasks that require a high level of intelligence, such as face recognition, chatbots, and autonomous driving, it is difficult to design clear logic rules. Therefore, traditional programming methods are powerless for those kinds of tasks, whereas artificial intelligence (AI), as the key technology to solve this kind of problem, is very promising.

With the rise of deep learning algorithms, AI has achieved or even surpassed human-like intelligence on some tasks. For example, the AlphaGo program has defeated Ke Jie, one of the strongest human Go players, and OpenAI Five has beaten the champion team OG in the Dota 2 game. In the meantime, practical technologies such as face recognition, intelligent speech, and machine translation have entered people's daily lives. Now our lives are actually surrounded by AI. Although the current level of intelligence is still a long way from artificial general intelligence (AGI), we still firmly believe that the era of AI has arrived.

Next, we will introduce the concepts of AI, machine learning, and deep learning, as well as the connections and differences between them.


CHAPTER 1 Introduction to Artificial Intelligence

 What we want is a machine that can learn from experience. -Alan Turing

2022년 5월 7일 토요일

1.11.6 Open-Source Systems as Learning Tools

 The free-software movement is driving legions of programmers to create thousands of open-source projects, including operating systems. Sites like http://freshmeat.net/ and http://distrowatch.com/ provide portals to many of these projects. As we stated earlier, open-source projects enable students to use source code as a learning tool. They can modify programs and test them, help find and fix bugs, and otherwise explore mature, full-featured operating systems, compilers, tools, user interfaces, and other types of programs. The availability of source code for historic projects, such as Multics, can help students to understand those projects and to build knowledge that will help in the implementation of new projects.

Another advantage of working with open-source operating systems is their diversity. GNU/Linux and BSD UNIX are both open-source operating systems, for instance, but each has its own goals, utility, licensing, and purpose. Sometimes, licenses are not mutually exclusive and cross-pollination occurs, allowing rapid improvements in operating-system projects. For example, several major components of OpenSolaris have been ported to BSD UNIX. The advantages of free software and open sourcing are likely to increase the number and quality of open-source projects, leading to an increase in the number of individuals and companies that use these projects.


1.11.5 Solaris

 Solaris is the commercial UNIX-based operating system of Sun Microsystems. Originally, Sun's SunOS operating system was based on BSD UNIX. Sun moved to AT&T's System V UNIX as its base in 1991. In 2005, Sun open-sourced most of the Solaris code as the OpenSolaris project. The purchase of Sun by Oracle in 2009, however, left the state of this project unclear.

Several groups interested in using OpenSolaris have expanded its features, and their working set is Project Illumos, which has expanded from the OpenSolaris base to include more features and to be the basis for several products. Illumos is available at http://wiki.illumos.org.

THE STUDY OF OPERATING SYSTEMS

 There has never been a more interesting time to study operating systems, and it has never been easier. The open-source movement has overtaken operating systems, causing many of them to be made available in both source and binary (executable) format. The list of operating systems available in both formats includes Linux, BSD UNIX, Solaris, and part of macOS. The availability of source code allows us to study operating systems from the inside out. Questions that we could once answer only by looking at documentation or the behavior of an operating system we can now answer by examining the code itself.

Operating systems that are no longer commercially viable have been open-sourced as well, enabling us to study how systems operated in a time of fewer CPU, memory, and storage resources. An extensive but incomplete list of open-source operating-system projects is available from http://dmoz.org/Computers/Software/Operating.Systems/Oepn.Sources/. In addition, the rise of virtualization as a mainstream (and frequently free) computer function makes it possible to run many operating systems on top of one core system. For example, VirtualBox (http://www.virtualbox.com) provides a free, open-source virtual machine manager on many operating systems. Using such tools, students can try out hundreds of operating systems without dedicated hardware.

In some cases, simulators of specific hardware are also available, allowing the operating system to run on "native" hardware, all within the confines of a modern computer and modern operating system. For example, a DECSYSTEM-20 simulator running on macOS can boot TOPS-20, load the source tapes, and modify and compile a new TOPS-20 kernel. An interested student can search the Internet to find the original papers that describe the operating system, as well as the original manuals.

The advent of open-source operating systems has also made it easier to make the move from student to operating-system developer. Not so many years ago, it was difficult or impossible to get access to source code. Now, such access is limited only by how much interest, time, and disk space a student has.


1.11.4 BSD UNIX

 BSD UNIX has a longer and more complicated history than Linux. It started in 1978 as a derivative of AT&T's UNIX. Releases from the University of California at Berkeley (UCB) came in source and binary form, but they were not open source because a license from AT&T was required. BSD UNIX's development was slowed by a lawsuit by AT&T, but eventually a fully functional, open-source version, 4.4BSD-lite, was released in 1994.

Just as with Linux, there are many distributions of BSD UNIX, including FreeBSD, NetBSD, OpenBSD, and DragonflyBSD. To explore the source code of FreeBSD, simply download the virtual machine image of the version of interest and boot it within Virtualbox, as described above for Linux. The source code comes with the distribution and is stored in /usr/src/. The kernel source code is in /usr/src/sys. For example, to examine the virtual memory implementation code in the FreeBSD kernel, see the files in /usr/src/sys/vm. Alternatively, you can simply view the source code online at https://svnweb.freebsd.org.

As with many open-source projects, this source code is contained in and controlled by a version control system-in this case, "subversion" (https://subversion.apache.org/source-code). Version control systems allow a user to "pull" an entire source code tree to his computer and "push" any changes back into the repository for others to then pull. These systems also provide other features, including an entire history of each file and a conflict resolution feature in case the same file is changed concurrently. Another version control system is git, which is used for GNU/Linux, as well as other programs (https://www.git-scm.com).

Darwin, the core kernel component of macOS, is based on BSD UNIX and is open-sourced as well. That source code is available from http://www.opensource.apple.com/. Every macOS release has its open-source components posted at that site. The name of the package that contains the kernel begins with "xnu". Apple also provides extensive developer tools, documentation, and support at http://developer.apple.com.


1.11.3 GNU/Linux

 As an example of a free and open-source operating system, consider GNU/Linux. By 1991, the GNU operating system was nearly complete. The GNU Project had developed compilers, editors, utilities, libraries, and games - whatever parts it could not find elsewhere. However, the GNU kernel never became ready for prime time. In 1991, a student in Finland, Linus Torvalds, released a rudimentary UNIX-like kernel using the GNU compilers and tools and invited contributions worldwide. The advent of the Internet meant that anyone interested could download the source code, modify it, and submit changes to Torvalds. Releasing updates once a week allowed this so-called "Linux" operating system to grow rapidly, enhanced by several thousand programmers. In 1991, Linux was not free software, as its license permitted only noncommercial redistribution. In 1992, however, Torvalds rereleased Linux under the GPL, making it free software (and also, to use a term coined later, "open source").

The resulting GNU/Linux operating system (with the kernel properly called Linux but the full operating system including GNU tools called GNU/Linux) has spawned hundreds of unique distributions, or custom builds, of the system. Major distributions include Red Hat, SUSE, Fedora, Debian, Slackware, and Ubuntu. Distributions vary in function, utility, installed applications, hardware support, user interface, and purpose. For example, Red Hat Enterprise Linux is geared to large commercial use. A variant of PCLinuxOS - called PCLinuxOS Supergamer DVD - is a live DVD that includes graphics drivers and games. A gamer can run it on any compatible system simply by booting from the DVD. When the gamer is finished, a reboot of the system resets it to its installed operating system.

You can run Linux on a Windows (or other) system using the following simple, free approach:

1. Download the free Virtualbox VMM tool from

https://www.virtualbox.org/

and install it on your system.

2. Choose to install an operating system from scratch, based on an installation image like a CD, or choose pre-built operating-system images that can be installed and run more quickly from a site like

http://virtualboxes.org/images/

These images are preinstalled with operating systems and applications and include many flavors of GNU/Linux.

3. Boot the virtual machine within Virtualbox.

An alternative to using Virtualbox is to use the free program Qemu (http://wiki.qumu.org/Download/), which includes the qemu-img command for converting Virtualbox images to Qemu images so that they can be imported easily.

With this text, we provide a virtual machine image of GNU/Linux running the Ubuntu release. This image contains the GNU/Linux source code as well as tools for software development. We cover examples involving the GNU/Linux image throughout this text, as well as in a detailed case study in Chapter 20.


1.11.2 Free Operating Systems

 To counter the move to limit software use and redistribution, Richard Stallman in 1984 started developing a free, UNIX-compatible operating system called GNU (which is a recursive acronym for "GNU's Not Unix!"). To Stallman, "free" refers to freedom of use, not price. The free-software movement does not object to trading a copy for an amount of money but holds that users are entitled to four certain freedoms: (1) to freely run the program, (2) to study and change the source code, and to give or sell copies either (3) with or (4) without changes. In 1985, Stallman published the GNU Manifesto, which argues that all software should be free. He also formed the Free Software Foundation (FSF) with the goal of encouraging the use and development of free software.

The FSF uses the copyrights on its programs to implement "copyleft," a form of licensing invented by Stallman. Copylefting a work gives anyone that possesses a copy of the work the four essential freedoms that make the work free, with the condition that redistribution must preserve these freedoms. The GNU General Public License (GPL) is a common license under which free software is released. Fundamentally, the GPL requires that the source code be distributed with any binaries and that all copies (including modified versions) be released under the same GPL license. The Creative Commons "Attribution Sharealike" license is also a copyleft license; "sharealike" is another way of stating the idea of copyleft.

1.11.1 History

 In the early days of modern computing (that is, the 1950s), software generally came with source code. The original hackers (computer enthusiasts) at MIT's Tech Model Railroad Club left their programs in drawers for others to work on. "Homebrew" user groups exchanged code during their meetings. Company-specific user groups, such as Digital Equipment Corporation's DECUS, accepted contributions of source-code programs, collected them onto tapes, and distributed the tapes to interested members. In 1970, Digital's operating systems were distributed as source code with no restrictions or copyright notice.

Computer and software companies eventually sought to limit the use of their software to authorized computers and paying customers. Releasing only the binary files compiled from the source code, rather than the source code itself, helped them to achieve this goal, as well as protecting their code and their ideas from their competitors. Although the Homebrew user groups of the 1970s exchanged code during their meetings, the operating systems for hobbyist machines (such as CPM) were proprietary. By 1980, proprietary software was the usual case.

1.11 Free and Open-Source Operating Systems

 The study of operating systems has been made easier by the availability of a vast number of free software and open-source releases. Both free operating systems and open-source operating systems are available in source-code format rather than as compiled binary code. Note, though, that free software and open-source software are two different ideas championed by different groups of people (see http://gnu.rog/phliosophy/open-source-misses-the-point.html/ for a discussion on the topic). Free software (sometimes referred to as free/libre software) not only makes source code available but also is licensed to allow no-cost use, redistribution, and modification. Open-source software does not necessarily offer such licensing. Thus, although all free software is open source, some open-source software is not "free." GNU/Linux is the most famous open-source operating system, with some distributions free and others open source only (http://www.gnu.org/distros/). Microsoft Windows is a well-known example of the opposite closed-source approach. Windows is proprietary software-Microsoft owns it, restricts its use, and carefully protects its source code. Apple's macOS operating system comprises a hybrid approach. It contains an open-source kernel named Darwin but includes proprietary, closed-source components as well.

Starting with the source code allows the programmer to produce binary code that can be executed on a system. Doing the opposite-reverse engineering the source code from the binaries-is quite a lot of work, and useful items such as comments are never recovered. Learning operating systems by examining the source code has other benefits as well. With the source code in hand, a student can modify the operating system and then compile and run the code to try out these changes, which is an excellent learning tool. This text includes projects that involve modifying operating-system source code, while also describing algorithms at a high level to be sure all important operating-system topics are covered. Throughout the text, we provide pointers to examples of open-source code for deeper study.

There are many benefits to open-source operating systems, including a community of interested (and usually unpaid) programmers who contribute to the code by helping to write it, debug it, analyze it, provide support, and suggest changes. Arguably, open-source code is more secure than closed-source code because many more eyes are viewing the code. Certainly, open-source code has bugs, but open-source advocates argue that bugs tend to be found and fixed faster owing to the number of people using and viewing the code. Companies that earn revenue from selling their programs often hesitate to open-source their code, but Red Hat and a myriad of other companies are doing just that and showing that commercial companies benefit, rather than suffer, when they open-source their code. Revenue can be generated through support contracts and the sale of hardware on which the software runs, for example.


2022년 5월 4일 수요일

1.10.6 Real-Time Embedded Systems

 Embedded computers are the most prevalent form of computers in existence. These devices are found everywhere, from car engines and manufacturing robots to optical drives and microwave ovens. They tend to have very specific tasks. The systems they run on are usually primitive, and so the operating systems provide limited features, preferring to spend their time monitoring and managing hardware devices, such as automobile engines and robotic arms.

These embedded systems vary considerably. Some are general-purpose computers, running standard operating systems-such as Linux-with special-purpose applications to implement the functionality. Others are hardware devices with a special-purpose embedded operating system providing just the functionality desired. Yet others are hardware devices with application-specific integrated circuits (ASICs) that perform their tasks without an operating system.

The use of embedded systems continues to expand. The power of these devices, both as standalone units and as elements of networks and the web, is sure to increase as well. Even now, entire houses can be computerized, so that a central computer-either a general-purpose computer or an embedded system-can control the home's devices, and web access can enable a home owner to tell the house to heat up before she arrives home. Someday, the refrigerator will be able to notify the grocery store when it notices the milk is gone.

Embedded systems almost always run real-time operating systems. A real-time system is used when rigid time requirements have been placed on the operation of a processor or the flow of data; thus, it is often used as a control device in a dedicated application. Sensors bring data to the computer. The computer must analyze the data and possibly adjust controls to modify the sensor inputs. Systems that control scientific experiments, medical imaging systems, industrial control systems, and certain display systems are real-time systems. Some automobile-engine fuel-injection systems, home-appliance controllers, and weapon systems are also real-time systems.

A real-time system has well-defined, fixed time constraints. Processing must be done within the defined constraints, or the system will fail. For instance, it would not do for a robot arm to be instructed to halt after it had smashed into the car it was building. A real-time system functions correctly only if it returns the correct result within its time constraints. Contrast this system with a traditional laptop system, where it is desirable (but not mandatory) to respond quickly.

In Chapter 5, we consider the scheduling facility needed to implement real-time functionality in an operating system, and in Chapter 20 we describe the real-time components of Linux.