I tried running GPT-J in a Windows 11 GPU environment.

Hello all. This is a note on how I ran EleutherAI’s GPT-J, an open-source language model in the style of GPT-3, in my own environment using mesh-transformer-jax.


This time I’ll be running it in the Docker Desktop environment on Windows 11, as described in the following article. The GPU is an NVIDIA RTX 3090.


Setting up an environment with a TensorFlow container

This time we will use NVIDIA’s TensorFlow container. First create a working folder on Windows, D:\work\gpt-j; mesh-transformer-jax will be cloned into it.
When starting the TensorFlow container, set up port forwarding and mount that folder so that Jupyter Notebook can be used later.

docker run --gpus all -it -p 8888:8888 -v D:\work\gpt-j:/gpt-j nvcr.io/nvidia/tensorflow:21.12-tf2-py3 bash

Once started, clone mesh-transformer-jax.

cd /gpt-j
git clone https://github.com/kingoflolz/mesh-transformer-jax.git
cd mesh-transformer-jax

For some reason, the requirements.txt in this repository specifies the CPU-only tensorflow-cpu package, so rewrite it to specify the regular tensorflow package. Also install the jax and jaxlib versions that support CUDA.

sed -e 's/tensorflow-cpu/tensorflow/' requirements.txt > new_requirements.txt
pip install -r new_requirements.txt
pip install jax==0.2.12
pip install jaxlib==0.1.68+cuda111 -f https://storage.googleapis.com/jax-releases/jax_releases.html
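
Before downloading the weights, it can be worth confirming that the CUDA build of jaxlib actually sees the GPU. The following is a minimal sketch, not part of mesh-transformer-jax: the helper name jax_device_summary is hypothetical, and it returns None when jax cannot be imported so that it is safe to run anywhere.

```python
def jax_device_summary():
    """Return (platform, device_kind) pairs for the devices JAX can see.

    Hypothetical helper for illustration only.
    Returns None if jax is not installed in the current environment.
    """
    try:
        import jax
    except ImportError:
        return None
    # jax.devices() lists the devices of the default backend.
    return [(d.platform, d.device_kind) for d in jax.devices()]


if __name__ == "__main__":
    print(jax_device_summary())
```

Inside the container, an entry for the RTX 3090 suggests the CUDA jaxlib was picked up. A list containing only CPU devices suggests a CPU-only jaxlib is installed.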

Next, download the parameters for the slim version of GPT-J-6B and extract them.

wget -c https://mystic.the-eye.eu/public/AI/GPT-J-6B/step_383500_slim.tar.zstd
tar -I zstd -xf step_383500_slim.tar.zstd

Check the operation of GPT-J.

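Run resharding_example.py to see if GPT-J works. The script does not print the result of infer, so first create a copy that wraps the call in print.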

sed -e 's/infer("EleutherAI is")/print(infer("EleutherAI is"))/' resharding_example.py > resharding_example2.py
python resharding_example2.py

Execution results (excerpt)

completion done in 98.40319633483887s
[' a single player, strategy game ....

The first inference takes a while, but GPT-J is working if the output is a plausible continuation of the prompt “EleutherAI is”.
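
The long first call is probably dominated by one-time work such as JAX compiling the model on first use, although the output above does not prove this by itself. One way to check is to time the first and later calls separately. The sketch below uses a hypothetical timed helper, and fake_infer is only a stand-in for the real infer function.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_infer(context):
    # Stand-in for the real infer(); it only echoes the prompt.
    return [context + " ..."]


if __name__ == "__main__":
    # With the real infer(), the first call would be expected to be the slow one.
    for attempt in range(2):
        _, seconds = timed(fake_infer, "EleutherAI is")
        print(f"call {attempt + 1}: {seconds:.3f}s")
```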

Trying other prompts.

Jupyter Notebook is more convenient for trying other prompts. Let’s try one by rewriting the infer call in the last line of the resharding_example2.py created above.

top_p = 0.9
temp = 1

context = '''私は真実を答える賢い質問応答ボットです。 
Q: 日本の人口は?
A: 1.2億人です。
Q: 世界で一番人口が多い国は?
A: '''

print(infer(top_p=top_p, temp=temp, gen_len=64, context=context)[0])

Execution results (excerpt)

Q: 日本の人口は?
A: 1.2億人です。
Q: 世界で一番人口が多い国は?
completion done in 9.850934267044067s
Q: 欧州で人口が多い国は?
A: 英国です。
Q: 経済力が優れている国は?
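
The prompt above is a few-shot question-and-answer prompt, and the model keeps inventing further Q/A pairs after answering. A small helper can assemble such prompts and cut the completion at the next question. This is a sketch only: build_qa_prompt and first_answer are hypothetical names, and the example strings are English renderings of the Japanese prompt used above.

```python
def build_qa_prompt(preamble, examples, question):
    """Assemble a few-shot Q/A prompt that ends with an open 'A: ' slot."""
    lines = [preamble]
    for q, a in examples:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    lines.append(f"Q: {question}")
    lines.append("A: ")
    return "\n".join(lines)


def first_answer(completion):
    """Keep only the text before the model starts a new 'Q:' line."""
    return completion.split("\nQ:", 1)[0].strip()


if __name__ == "__main__":
    prompt = build_qa_prompt(
        "I am a clever question-answering bot that answers truthfully.",
        [("What is the population of Japan?", "It is 120 million.")],
        "Which country has the largest population in the world?",
    )
    print(prompt)
```

The resulting string can be passed as the context argument of infer, and first_answer can be applied to the returned text.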

In closing

It’s fun to run GPT-J in my own environment. The RTX 3090’s 24 GB of memory is only just enough, so a GPU with more memory may be a better choice.
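
A rough calculation shows why memory is tight. The sketch below estimates the size of the weights alone, assuming roughly 6.05 billion parameters for GPT-J-6B (an approximation) and either 4 bytes (float32) or 2 bytes (bfloat16) per parameter. Which precision the script actually keeps on the GPU is not checked here, and activations and workspace memory are ignored.

```python
GIB = 1024 ** 3

# Assumption: approximate parameter count of GPT-J-6B.
N_PARAMS = 6_050_000_000


def weight_memory_gib(n_params, bytes_per_param):
    """Memory needed for the weights alone, in GiB."""
    return n_params * bytes_per_param / GIB


if __name__ == "__main__":
    for label, size in [("float32", 4), ("bfloat16", 2)]:
        print(f"{label}: {weight_memory_gib(N_PARAMS, size):.1f} GiB")
    # For comparison, nvidia-smi reports 24576 MiB (24 GiB) on the RTX 3090.
```

Under these assumptions, 32-bit weights alone would nearly fill a 24 GiB card, while 16-bit weights would need about half of it.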

This time I tested it on an HP gaming PC as an experiment, but if you want to run it on Azure or a similar cloud, the following VM sizes may be a better fit.

  • Standard_ND40rs_v2 (NVIDIA V100, 32 GB per GPU)
  • Standard_ND96asr_v4 (NVIDIA A100, 40 GB per GPU)
  • Standard_ND96amsr_A100_v4 (NVIDIA A100, 80 GB per GPU)

Azure OpenAI is still in preview status, so I hope it will be available for general use soon.



How to build a DeepLearning environment using WSL, Docker Desktop and GPUs in Windows 11.


In this article, I will show you how to create a Deep Learning environment on Windows 11.

The goal is to use NVIDIA GPUs from containers. Once that works, you can get a Deep Learning environment quickly just by launching a public Deep Learning container. Yay!

Install the driver to use CUDA with WSL.

First, download and install the NVIDIA driver for using CUDA with WSL from the following URL.


You will be able to use CUDA from WSL.

Install Docker Desktop

Next, download Docker Desktop from the following URL and install it.


Now you’re all set!
Docker Desktop now supports nvidia-docker, so you can use CUDA from the container. The following URL is an introductory article.


Let’s see if CUDA is available on Ubuntu.

Let’s start NVIDIA’s CUDA container and run nvidia-smi. This container is useful if you want to install TensorFlow or PyTorch on your own.

Command (run it from PowerShell or Command Prompt; the following specifies Ubuntu 20.04 with CUDA 11.6 installed)

docker run -it --gpus=all --rm nvidia/cuda:11.6.0-base-ubuntu20.04 nvidia-smi

Execution result

| NVIDIA-SMI 510.00       Driver Version: 510.06       CUDA Version: 11.6     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA GeForce ...  On   | 00000000:01:00.0  On |                  N/A |
| 30%   30C    P8    31W / 350W |   1187MiB / 24576MiB |     N/A      Default |
|                               |                      |                  N/A |

| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|  No running processes found                                                 |

Different frameworks support different versions of CUDA, but you can get the version you need simply by specifying it in the container tag.
There is no longer any need to reinstall CUDA and cuDNN.
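
If you want the same GPU information inside a script, nvidia-smi can print selected fields as CSV. The following sketch parses that output; the function names parse_gpu_csv and query_gpus are hypothetical, and the parser assumes every memory field is a plain number in MiB.

```python
import subprocess

# nvidia-smi flags for machine-readable output (values in MiB with nounits).
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,memory.used",
    "--format=csv,noheader,nounits",
]


def parse_gpu_csv(text):
    """Parse 'name, total, used' lines into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        name, total, used = [field.strip() for field in line.split(",")]
        gpus.append({"name": name, "total_mib": int(total), "used_mib": int(used)})
    return gpus


def query_gpus():
    """Run nvidia-smi and parse its CSV output."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    try:
        print(query_gpus())
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"nvidia-smi is not usable here: {exc}")
```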


Let’s check if CUDA can be used with TensorFlow.

TensorFlow can likewise be used straight from a container. No installation is required.

Command (run it from PowerShell or Command Prompt; the following specifies the latest GPU container)

docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu python -c "import os; os.environ['TF_CPP_MIN_LOG_LEVEL']='1'; from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())"

Execution result

[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
incarnation: 7510238269992894144
xla_global_id: -1
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 22681550848
locality {
  bus_id: 1
  links {
incarnation: 3708138668520980037
physical_device_desc: "device: 0, name: NVIDIA GeForce RTX 3090, pci bus id: 0000:01:00.0, compute capability: 8.6"
xla_global_id: 416903410
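
The command above uses device_lib, which lives under tensorflow.python and is an internal module. In TensorFlow 2 the same check can be sketched with the public tf.config API. The helper name tf_gpu_names is hypothetical, and it returns None when tensorflow cannot be imported.

```python
def tf_gpu_names():
    """Return the names of the GPUs TensorFlow can see.

    Hypothetical helper for illustration only.
    Returns None if tensorflow is not installed in the current environment.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [gpu.name for gpu in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    print(tf_gpu_names())
```

An empty list means TensorFlow started but found no GPU, which usually points at the container being started without --gpus all or at a driver problem.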

You can specify the version by referring to the following URL.


You can also use the TensorFlow containers that NVIDIA provides on NGC. Tags containing tf1 are the version 1 series, and tags containing tf2 are the version 2 series.


Let’s see if we can use CUDA with PyTorch.

PyTorch can likewise be used straight from a container. No installation is required.

Command (run it from PowerShell or Command Prompt; the following specifies the NGC container)

docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:21.12-py3 python -c "import torch; print('version={}\ncuda is available={}\ncuda device count={}'.format(torch.__version__, torch.cuda.is_available(), torch.cuda.device_count()))"

Execution results (excerpt)

cuda is available=True
cuda device count=1

You can specify the version by referring to the following URL


Actually running a program

If you are familiar with containers, you probably don’t need any explanation, but I recommend mounting a Windows folder when starting a container and editing your program in that mounted folder. That way your files remain on the Windows side after the container exits.

Using Jupyter Notebook

If you are using Jupyter Notebook, you may want to set up port forwarding as well.

In the following TensorFlow startup, we are running bash. We also set up a mount of the Windows D:\work folder to the container’s /work, and forward the container’s 8888 port to the Windows 8888 port.

docker run --gpus all -it -p 8888:8888 -v D:\work:/work nvcr.io/nvidia/tensorflow:21.12-tf2-py3 bash
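
If you start containers often, the pieces of this command (image, mount, port) can be assembled by a small script. This is only a sketch: docker_run_args is a hypothetical helper, and it simply reproduces the flags used above.

```python
def docker_run_args(image, host_dir, container_dir, port=8888, command="bash"):
    """Build the argument list for an interactive GPU container."""
    return [
        "docker", "run", "--gpus", "all", "-it",
        "-p", f"{port}:{port}",                # forward the Jupyter port
        "-v", f"{host_dir}:{container_dir}",   # mount the Windows folder
        image, command,
    ]


if __name__ == "__main__":
    args = docker_run_args(
        "nvcr.io/nvidia/tensorflow:21.12-tf2-py3", r"D:\work", "/work"
    )
    print(" ".join(args))
```

The list can be passed to subprocess.run as it is, which avoids shell quoting problems with Windows paths.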

Start Jupyter Notebook inside the container. It prints a URL, which you can open in your Windows browser.

jupyter notebook


    Or copy and paste this URL:

When opening the URL in a browser, replace the hostname part with localhost.
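
The hostname replacement can also be done mechanically. The sketch below uses Python’s urllib.parse; to_localhost is a hypothetical helper, and the URL in the example is a made-up placeholder, not real Jupyter output.

```python
from urllib.parse import urlsplit, urlunsplit


def to_localhost(url):
    """Replace the host part of a URL with localhost, keeping port and query."""
    parts = urlsplit(url)
    port = f":{parts.port}" if parts.port else ""
    return urlunsplit(parts._replace(netloc=f"localhost{port}"))


if __name__ == "__main__":
    # Placeholder URL for illustration; use the one Jupyter prints.
    print(to_localhost("http://hostname:8888/?token=abc123"))
```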

Using Visual Studio Code

The Remote Development (Remote - Containers) extension for VSCode lets you attach to a container from Visual Studio Code and use file editing, the terminal and debugging inside it.
If you only want to edit files, you don’t need this extension. You can edit the mounted files directly.

Once the extension is installed, click the icon at the bottom left of the window and select Attach to Container.


I can now run packages that were only available for Linux without difficulty.
WSL, CUDA on WSL, and Docker Desktop are the best!