The core of deep learning is the design of algorithms; deep learning frameworks are simply the tools we use to implement them. In the following, we demonstrate three core functions of the TensorFlow deep learning framework to help clarify the role that frameworks play in algorithm design.
a) Accelerated Calculation
A neural network is essentially composed of a large number of basic mathematical operations such as matrix multiplication and addition. One important function of TensorFlow is to make GPU-based parallel acceleration of these operations convenient. To demonstrate the acceleration effect of the GPU, we can compare the mean running time of repeated matrix multiplications on the CPU and the GPU as follows.
We create two matrices A and B with shapes [1,n] and [n,1], respectively. The size of the matrices can be adjusted using the parameter n. The code is as follows:
import tensorflow as tf

n = 1000  # Matrix size parameter (example value; vary n to change the size)

# Create two matrices running on CPU
with tf.device('/cpu:0'):
    cpu_a = tf.random.normal([1, n])
    cpu_b = tf.random.normal([n, 1])
    print(cpu_a.device, cpu_b.device)
# Create two matrices running on GPU
with tf.device('/gpu:0'):
    gpu_a = tf.random.normal([1, n])
    gpu_b = tf.random.normal([n, 1])
    print(gpu_a.device, gpu_b.device)
Next, we implement functions for the CPU and GPU operations and measure the computation time of the two functions with timeit.timeit(). Note that the first calculation generally requires additional environment initialization, so its time should not be counted. We discard this time with a warm-up run and then measure the calculation time as follows:
import timeit

def cpu_run():  # CPU function
    with tf.device('/cpu:0'):
        c = tf.matmul(cpu_a, cpu_b)
    return c

def gpu_run():  # GPU function
    with tf.device('/gpu:0'):
        c = tf.matmul(gpu_a, gpu_b)
    return c

# First calculation needs warm-up; these timings are discarded
cpu_time = timeit.timeit(cpu_run, number=10)
gpu_time = timeit.timeit(gpu_run, number=10)
print('warmup:', cpu_time, gpu_time)
# Calculate and print mean running time
# (timeit returns the total for all 10 runs, so divide by 10)
cpu_time = timeit.timeit(cpu_run, number=10) / 10
gpu_time = timeit.timeit(gpu_run, number=10) / 10
print('run time:', cpu_time, gpu_time)
We plot the computation time in the CPU and GPU environments for different matrix sizes, as shown in Figure 1-21. When the matrix size is small, the CPU and GPU times are almost the same, and the advantage of GPU parallel computing is not apparent. As the matrix size grows, the CPU computation time increases significantly, while the GPU takes full advantage of parallel computing and its computation time barely changes.
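The warm-up-then-measure pattern can be wrapped in a small helper and repeated over several matrix sizes to collect the data points for such a plot. The following is a minimal sketch in plain Python, not the code used to produce the figure. The names mean_time and make_dot are hypothetical, and the workload is a pure-Python stand-in for the [1,n] by [n,1] product so that the sketch runs without TensorFlow or a GPU.

```python
import timeit


def mean_time(fn, number=10):
    """Return the mean seconds per call of fn, excluding a warm-up round."""
    timeit.timeit(fn, number=number)          # warm-up round, result discarded
    total = timeit.timeit(fn, number=number)  # measured round (total seconds)
    return total / number


def make_dot(n):
    """Hypothetical stand-in workload: a [1, n] by [n, 1] product."""
    row = [1.0] * n
    col = [2.0] * n
    return lambda: sum(x * y for x, y in zip(row, col))


# Sweep the matrix size and record the mean time per call for each size.
sizes = [10, 100, 1000, 10000]
results = {n: mean_time(make_dot(n)) for n in sizes}
for n, seconds in results.items():
    print(f"n={n:>6}: {seconds:.2e} s per call")
```

To time the TensorFlow version instead, one could pass functions such as cpu_run and gpu_run to mean_time after recreating the matrices for each n.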
b) Automatic Gradient Calculation
When TensorFlow is used to construct the forward calculation process, it not only produces numerical results but also automatically builds a computational graph. TensorFlow provides automatic differentiation, which can calculate the derivative of the output with respect to the network parameters without manual derivation. Consider the following function:
y = aw^2 + bw + c
The derivative of the output y with respect to the variable w is
dy/dw = 2aw + b
Consider the derivative at (a,b,c,w) = (1,2,3,4). We get dy/dw = 2*1*4 + 2 = 10.
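As a quick sanity check of the hand calculation, the derivative can also be approximated numerically. The sketch below uses a central finite difference in plain Python. The function names f and central_difference and the step size h are illustrative assumptions, not part of TensorFlow.

```python
def f(w, a=1.0, b=2.0, c=3.0):
    """Evaluate y = a*w^2 + b*w + c."""
    return a * w**2 + b * w + c


def central_difference(func, x, h=1e-5):
    """Approximate d(func)/dx at x with a symmetric finite difference."""
    return (func(x + h) - func(x - h)) / (2 * h)


analytic = 2 * 1.0 * 4.0 + 2.0        # 2aw + b at (a, b, w) = (1, 2, 4)
numeric = central_difference(f, 4.0)  # numerical estimate at w = 4
print(analytic, numeric)
```

The two values agree up to floating-point rounding, which confirms the manual result of 10.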
With TensorFlow, we can calculate the derivative directly from the expression of a function, without manually deriving the expression of the derivative; TensorFlow derives it automatically. The code is implemented as follows:
import tensorflow as tf
# Create 4 tensors
a = tf.constant(1.)
b = tf.constant(2.)
c = tf.constant(3.)
w = tf.constant(4.)
with tf.GradientTape() as tape:  # Track derivative
    tape.watch([w])  # Add w to derivative watch list
    # Design the function
    y = a * w**2 + b * w + c
# Auto derivative calculation
[dy_dw] = tape.gradient(y, [w])
print(dy_dw) # print the derivative
The result of the program is
tf.Tensor(10.0, shape=(), dtype=float32)
It can be seen that the result of TensorFlow's automatic differentiation is consistent with the result of the manual calculation.
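To make the idea of a recorded computational graph more concrete, here is a toy reverse-mode differentiation sketch in plain Python. It is not TensorFlow's implementation; the class Node and the function backward are hypothetical names. The sketch only illustrates the general principle: each operation records its inputs and local derivatives, and the chain rule is then applied backward from the output.

```python
class Node:
    """A scalar value that remembers how it was computed."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # tuples of (parent_node, local_derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)))

    def __pow__(self, exponent):
        # Only constant numeric exponents are supported in this sketch.
        return Node(self.value ** exponent,
                    ((self, exponent * self.value ** (exponent - 1)),))


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node used."""
    order, seen = [], set()

    def visit(node):
        # Post-order traversal: inputs are listed before the nodes using them.
        if node in seen:
            return
        seen.add(node)
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Walk from the output back to the inputs, applying the chain rule.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local


a, b, c, w = Node(1.0), Node(2.0), Node(3.0), Node(4.0)
y = a * w ** 2 + b * w + c
backward(y)
print(y.value, w.grad)  # prints 27.0 10.0
```

The gradient with respect to w matches the value of 10 obtained above, and the same backward pass also fills in the gradients for a, b, and c.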
c) Common Neural Network Interface
In addition to underlying mathematical functions such as matrix multiplication and addition, TensorFlow also provides a series of convenient functions for deep learning systems, including commonly used neural network operations, commonly used network layers, network training, and model saving, loading, and deployment. With TensorFlow, you can use these functions to complete common production workflows efficiently and reliably.