
Install Python TensorFlow Package with GPU Support in Windows 10/11

(Subject: Data Analytics/Authored by: Liping Liu on 5/13/2024 4:00:00 AM)/Views: 1369

TensorFlow is one of the two major machine learning extensions to Python (the other is PyTorch, created by Meta). However, its installation can become very complicated given the various conflicts between versions of its dependent packages. The problem gets worse if we want to take advantage of GPU computational capabilities. In addition, Google has announced that TensorFlow 2.10 was the last TensorFlow release to support GPU on native Windows; starting with TensorFlow 2.11, TensorFlow must be installed in the Windows Subsystem for Linux. This article provides an easy-to-follow and verified approach to installing TensorFlow 2.10 on Windows 10/11 for Python 3.9.

Prerequisites:

  1. Check the list of CUDA®-enabled GPU cards to see if you have a supported graphics card.
  2. Make sure you have installed the most recent NVIDIA Game Ready driver for your GPU card (a quick way to confirm is shown after this list).
  3. Make sure you have the Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017, 2019, or 2022. If not, go to the Microsoft Visual C++ downloads page and install the redistributable for your platform. If you already have Visual Studio installed, open Visual Studio Installer and modify the installation to include the C++ language compiler. If you don't have Visual Studio, you can download the installer from https://visualstudio.microsoft.com/downloads/, double-click the file, and make sure the "Desktop development with C++" workload is selected. This will install the required C++ toolchain.
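
To verify the driver (step 2 above), you can run NVIDIA's nvidia-smi utility from a command prompt; it prints the installed driver version, the highest CUDA version the driver supports, and the detected GPU:

        nvidia-smi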

 

Installation Steps:

  1. Install Miniconda with Python: go to anaconda.com and download Miniconda3-py39_23.5.2-0-Windows-x86_64.exe, the version that comes with Python 3.9. (See this guide for details):
    • Choose to install for all users and set the file location to C:\ProgramData\miniconda3
    • Check the checkbox "Register Miniconda3 as the system Python"
    • Add both the Miniconda installation location C:\ProgramData\miniconda3 and the Conda binaries folder C:\ProgramData\miniconda3\condabin to the system PATH.

  2. Click the Windows menu, right-click on Anaconda Powershell Prompt, choose Run as Administrator, and run the following commands one by one to create a new virtual environment called tf and install the required packages in it.

Note the following when running the commands: 

  • Make sure you run the commands one by one, not all at once. Press the "y" key when prompted with "Proceed ([y]/n)?"
  • Install all packages using conda except for the tensorflow package. "conda install tensorflow==2.10" will not enable GPU support for some reason, so I use pip to install tensorflow.
  • In the future, try to use conda to install new packages. Mixing pip and conda may cause package configuration conflicts.
  • The following commands will install not only TensorFlow with GPU support but also two IDEs, Jupyter Lab and Spyder, which will also have GPU support.
  • All the commands should run without any warnings or errors. If there are any, go back to the prior steps to check whether some steps were not performed or were performed incorrectly.
  • Install CUDA Toolkit 11.2 and cuDNN version 8.1.0 for TensorFlow 2.10 in Python 3.9 (see the bottom of the page to decide the CUDA and cuDNN versions if you have a VERY OLD computer). TensorFlow 2.10 is the last version to support native Windows. The Python versions compatible with the TensorFlow library are 3.9-3.11.
        conda create --name tf python=3.9
        conda activate tf
        conda install -n base conda-libmamba-solver
        conda install -c conda-forge cudatoolkit=11.2 cudnn=8.1.0 --solver=libmamba
        conda install pandas
        conda install packaging==23.0
        conda install toolz
        python -m pip install --upgrade pip
        pip install tensorflow==2.10
        conda install numpy
        conda install scikit-learn
        conda install scipy
        conda install seaborn
        conda install matplotlib
        conda install spyder
        conda install notebook
        conda install ipykernel
        python -m ipykernel install
        conda install jupyterlab

Now, verify the GPU setup:

        python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"

Verify the CPU setup:

        python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"

You can also type "python" to enter the Python interpreter and verify the devices there:

        from tensorflow.python.client import device_lib
        print(device_lib.list_local_devices())

Or type "jupyter lab" to open Jupyter Notebook and run the same commands there:

        from tensorflow.python.client import device_lib
        print(device_lib.list_local_devices())

In case you need to remove the virtual environment "tf", first exit the tf environment using the command "conda deactivate". To list all environments, use the command "conda env list". To remove the "tf" environment:
    conda remove --name tf --all

Note that each time you open the Anaconda Prompt, make sure to re-activate the tf environment before opening Python or Jupyter Notebook or running Python scripts:


        conda activate tf

Then you can type the "spyder" command to open the Spyder IDE or "jupyter lab" to open Jupyter Notebook. For Jupyter Notebook, change the directory to the root directory using the command "cd c:\" before running "jupyter lab" so that you can save and open notebooks in your desired locations.
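For example, a typical session that opens Jupyter Notebook from the root directory looks like this:

        conda activate tf
        cd c:\
        jupyter lab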

        

Experimental Example: Movie Review Classification

In this experiment, I tested the advantage of TensorFlow with GPU support by running the same TensorFlow sequential model on two machines. The first machine is a Microsoft Surface Studio 2 with an Intel i7 and 16GB RAM; I ran the training task on it without GPU acceleration. The second machine is an HP Omen 45L with an Intel i7, 64GB RAM, and an NVIDIA RTX 3080 graphics card, which was used for TensorFlow acceleration. The difference is dramatic: the training task takes 3 minutes and 30 seconds on the second machine and over 107 minutes on the first. Even on the second machine alone, without GPU acceleration the training takes at least 10 times longer.
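
If you want to reproduce the CPU-versus-GPU comparison on a single machine, here is a minimal sketch: tf.config.set_visible_devices with an empty device list hides all GPUs from TensorFlow, so the same training script can be timed in both modes. The FORCE_CPU flag is my own illustrative name, and the hiding must happen before TensorFlow creates any tensors.

import time
import tensorflow as tf

FORCE_CPU = False  # hypothetical switch: set to True to time a CPU-only run
if FORCE_CPU:
    # Must run before TensorFlow touches the GPU
    tf.config.set_visible_devices([], 'GPU')

print('visible GPUs:', tf.config.get_visible_devices('GPU'))

start = time.time()
# ... build and fit the model exactly as in the example below ...
print(f'elapsed: {time.time() - start:.1f} seconds')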

In this task, we will load a data set of 50,000 movie reviews from IMDB along with their class labels. We will train a neural network on 25,000 reviews and then use the trained model to predict the labels of the remaining 25,000 reviews. First, let us load the TensorFlow library and its IMDB data set:

import tensorflow as tf
from tensorflow import keras

# Load data
imdb = keras.datasets.imdb

The actual reviews are made of words but have been encoded using a word index. We can look up the index, which maps each word to an index number, as follows:

# A dictionary mapping words to integers
word_index = imdb.get_word_index()

The word_index is a dictionary of all words and their indices, so we can get a list of words using word_index.keys() and a list of indices using word_index.values(). From these we can find that the index values range from 1 to 88584 and that there are 88584 words in the vocabulary:

import numpy as np

# Find the vocabulary size, the largest index, and the word it maps to
values = list(word_index.values())
words = list(word_index.keys())
max_index = np.max(values)
max_position = np.argmax(values)
word = words[max_position]
size = len(values)

print(f'there are {size} words in vocabulary, the largest index is {max_index} which is for the word {word}')

We can reverse the mapping of word_index so that we can use the index to find words:

index_word = dict([(value, key) for (key, value) in word_index.items()])

Now we can load the IMDB data set into four lists: train_data, train_labels, test_data, and test_labels, using the entire vocabulary of 88584 words:

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=88584)

If your computer is not able to handle this large number of words, you can use only the most frequent 10000 words in the vocabulary instead:

(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

Now we can explore the data: train_labels and test_labels are just two lists of class labels of 0 and 1, while train_data and test_data are two lists of integers representing the indices of the words in each review. For example, train_data[5] is the review at index 5 in train_data and train_labels[5] is its classification.
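
For instance, here is a quick look at the sizes and the raw encoding (a small illustrative addition; the printed values depend on which load above you used):

# Quick inspection of the loaded data
print(len(train_data), len(test_data))   # 25000 reviews in each split
print(train_labels[:10])                 # the first ten class labels (0 or 1)
print(train_data[5][:10])                # the first ten word indices of review 5
print(train_labels[5])                   # the label of that review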

We can use the following function to recover the actual review from the indices.

# recover text from word indices
def decode_text(indices):
    result = ''
    for i in indices:
        result = result + ' ' + index_word.get(i, '?')
    return result

Now, if we want to read the actual text of this review:

txt = train_data[5]
decode_text(txt)

Now let us prepare the data for training. First, we will reserve the first four indices for special words, such as the padding word used to make all reviews the same length and the marker for the beginning of each review:

word_index = {k: (v + 3) for k, v in word_index.items()}
word_index["<PAD>"] = 0
word_index["<START>"] = 1
word_index["<UNK>"] = 2
word_index["<UNUSED>"] = 3

Because of the above addition of four words to word_index, we need to rebuild the reverse mapping index_word and change the decode_text function slightly so that it skips the padding index 0:

index_word = dict([(value, key) for (key, value) in word_index.items()])

def decode_text(indices):
    result = ''
    for i in indices:
        if i != 0:
            result = result + ' ' + index_word.get(i, '?')
    return result

Second, we are going to pad all reviews to the same length. To get an idea of how many words we should pad to, let us find the longest review in the data set using the following function:

def find_max_words(lstWords):
    # Return the length of the longest review in the list
    longest = len(lstWords[0])
    for i in range(len(lstWords)):
        if len(lstWords[i]) > longest:
            longest = len(lstWords[i])
    return longest

print(f'the longest review has {find_max_words(train_data)} words in training data')
print(f'the longest review has {find_max_words(test_data)} words in test data')

We found that the longest review is about 2500 words, so we pad every review to 2500 words. With padding="post", this appends index 0, the word "<PAD>", to the end of each review up to 2500 words:

train_data = keras.preprocessing.sequence.pad_sequences(train_data, value=word_index["<PAD>"], padding='post', maxlen=2500)
test_data = keras.preprocessing.sequence.pad_sequences(test_data, value=word_index["<PAD>"], padding='post', maxlen=2500)

Finally, let us create, train, and evaluate a model made of six layers: a word-embedding layer mapping each of the 88584 vocabulary words to a 256-dimensional vector, a second layer that performs global average pooling over the sequence, a third layer of 256 nodes with ReLU activation, a fourth dropout layer for regularization, a fifth layer of 256 nodes with ReLU activation, and a final layer with one node for predicting the class label.

vocab_size = 88584

model = keras.Sequential([
    keras.layers.Embedding(vocab_size, 256),          # word embedding: index -> 256-d vector
    keras.layers.GlobalAveragePooling1D(),            # average the embeddings over the sequence
    keras.layers.Dense(256, activation=tf.nn.relu),
    keras.layers.Dropout(0.2),                        # regularization
    keras.layers.Dense(256, activation=tf.nn.relu),
    keras.layers.Dense(1, activation=tf.nn.sigmoid)   # probability of a positive review
])


model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['acc'])

# Split data into validation and train sets
x_val, y_val = train_data[:10000], train_labels[:10000]
partial_x_train, partial_y_train = train_data[10000:], train_labels[10000:]

# Start learning
model.fit(partial_x_train, partial_y_train, batch_size=512, epochs=200, validation_data=(x_val, y_val))
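
model.fit also returns a History object whose history dictionary records the loss and accuracy per epoch. As a small sketch beyond the original code, if you capture it by changing the line above to "history = model.fit(...)", you can plot the learning curves with matplotlib (installed earlier):

import matplotlib.pyplot as plt

# Assumes: history = model.fit(partial_x_train, partial_y_train, batch_size=512,
#                              epochs=200, validation_data=(x_val, y_val))
plt.plot(history.history['acc'], label='training accuracy')
plt.plot(history.history['val_acc'], label='validation accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()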

# evaluating the model produces both the loss value and the accuracy:
results = model.evaluate(test_data, test_labels)
print(results)

As a way to demonstrate the performance of the model, we can print 100 test reviews and their predictions:

COUNT = 100
test_data_print, test_labels_print = test_data[:COUNT], test_labels[:COUNT]
predictions = model.predict(test_data_print)

for i in range(COUNT):
    print("Review: " + decode_text(test_data_print[i]))
    pred = predictions[i][0]
    value = test_labels_print[i]
    print("Prediction / Value: " + str(pred) + " / " + str(value))
    if (pred < 0.5) != (value < 0.5):
        print('\x1b[1;31mIncorrect prediction\x1b[0m')   # red
    else:
        print('\x1b[1;32mCorrect prediction\x1b[0m')     # green
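
As a small addition beyond the original demo, you can also tally how many of these printed predictions are correct:

# Count predictions on the printed subset that match the true labels
correct = sum(1 for i in range(COUNT)
              if (predictions[i][0] >= 0.5) == (test_labels_print[i] >= 0.5))
print(f'{correct} of {COUNT} predictions correct')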

 

GPU Compute Capability vs CUDA Toolkit Version

In the above installation, we installed CUDA Toolkit 11.2 and cuDNN version 8.1.0 for TensorFlow 2.10 in Python 3.9. The choices are not arbitrary; in fact, you need to decide them based on your GPU card. Here is a simple guideline for making the choices:

1) For native Windows, TensorFlow 2.10 is the last supported version, and it works with Python 3.9 -- 3.11 only. So the latest Python version (3.12 at this point) is not an option.

2) To decide the CUDA Toolkit version, use the chart below. Unless you have a very old GPU, version 11.8 is good for GPUs with compute capabilities between 3.5 and 9.0. Also note that earlier versions of the CUDA Toolkit may not recognize later versions of Visual Studio. To find your GPU's compute capability, see CUDA - Wikipedia (or query it directly from TensorFlow, as shown in the sketch after this list). A list of CUDA Toolkits is available at the CUDA Toolkit Archive | NVIDIA Developer.

Note: CUDA SDK 10.2 is the last official release for macOS, as support will not be available for macOS in newer releases.

3) cuDNN has to match the CUDA Toolkit: when you download cuDNN, you will be asked to choose the matching version of cuDNN based on the version of the CUDA Toolkit. A list of cuDNN libraries is available at the cuDNN Archive | NVIDIA Developer. Note that you need to log on to NVIDIA to be able to download.
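
If TensorFlow is already installed, you can also query the compute capability directly rather than looking it up; tf.config.experimental.get_device_details reports it for a detected GPU. A minimal sketch:

import tensorflow as tf

# Print the name and compute capability of the first detected GPU, if any
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    details = tf.config.experimental.get_device_details(gpus[0])
    print(details.get('device_name'), details.get('compute_capability'))
else:
    print('no GPU detected')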

Compute Capability (CUDA SDK support vs. microarchitecture, from the Tesla microarchitecture through Hopper):

CUDA SDK version(s)    Supported compute capabilities
1.0                    1.0 – 1.1
1.1 – 2.0              1.0 – 1.1+x
2.1 – 2.3.1            1.0 – 1.3
3.0 – 3.1              1.0 – 2.0
3.2 – 4.2              1.0 – 2.1
5.0 – 6.0              1.0 – 3.5
6.5                    1.1 – 5.x
7.0 – 7.5              2.0 – 5.x
8.0                    2.0 – 6.x
9.0 – 9.2              3.0 – 7.0
10.0 – 10.2            3.0 – 7.5
11.0                   3.5 – 8.0
11.1 – 11.4            3.5 – 8.6
11.5 – 11.7.1          3.5 – 8.7
11.8                   3.5 – 9.0
12.0 – 12.3            5.0 – 9.0

For example, the NVIDIA RTX 3080 used above is an Ampere card with compute capability 8.6, so per the table it requires CUDA SDK 11.1 or later.

 

 

