Richard Kuo

Generative AI - sample code


Text-to-Text (LLMs)

LLM prompting


LLM Server & Client


Colab’s LLM Server & Client


Ollama

ollama list
ollama run llama3.1

ollama chat/generate

ollama speak
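The chat/generate step above maps onto Ollama's REST API, which the local Ollama server exposes at http://localhost:11434 by default. A minimal sketch of a generate call with only the standard library, assuming llama3.1 has already been pulled and the server is running:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_generate_payload(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    # stream=False requests one complete JSON reply instead of chunks
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return its reply text."""
    data = json.dumps(build_generate_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # With the server running, this would print the model's answer:
    # print(generate("llama3.1", "Why is the sky blue?"))
    print(build_generate_payload("llama3.1", "Why is the sky blue?"))
```

Swapping the URL to /api/chat and the payload to {"model": ..., "messages": [...]} gives the chat variant of the same call.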


Audio-to-Text

Local ASR+LLM Server (on your PC+GPU)

  1. Run the server on your local PC: python whisper_llm_server.py
  2. Generate an audio file: python ../gTTS.py "Hello, how are you?" en
  3. Post the audio to the server: python post_audio.py
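A hedged sketch of what a client like post_audio.py in step 3 might look like. The endpoint path /asr and the base64-in-JSON body are assumptions for illustration, not the repo's actual protocol; adjust both to match your server:

```python
import base64
import json
import urllib.request

SERVER_URL = "http://localhost:5000/asr"  # assumed endpoint; change to your server's

def encode_audio(path: str) -> str:
    """Read an audio file and return its bytes as a base64 string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")

def post_audio(path: str) -> dict:
    """POST the encoded audio to the ASR+LLM server and return its JSON reply."""
    body = json.dumps({"audio": encode_audio(path)}).encode("utf-8")
    req = urllib.request.Request(
        SERVER_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    # print(post_audio("gTTS.mp3"))  # run once the server from step 1 is up
    pass
```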

Colab ASR+LLM Server (on Colab T4)

  1. Open pyngrok_Whisper_LLM_Server.ipynb and run it on a Colab T4 GPU
  2. Generate an audio file: python ../gTTS.py "Hello, how are you?" en
  3. Post the audio to the server: python post_audio.py

Image-to-Text (VLM)

VLM servers

To run a server, use one of the following:

  1. python llava_server.py
  2. python llava_next_server.py
  3. python phi3-vision_server.py

To run the client (posts an image and text to the VLM server):
python post_imgtxt.py images/barefeet1.jpg
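A sketch of a client along the lines of post_imgtxt.py, building a multipart/form-data body with the standard library. The endpoint path and the "image"/"text" field names are assumptions; match them to whichever VLM server you started:

```python
import urllib.request
import uuid
from pathlib import Path

SERVER_URL = "http://localhost:5000/vlm"  # assumed endpoint

def build_multipart(image_path: str, text: str) -> tuple[bytes, str]:
    """Build a multipart/form-data body with a text part and an image part."""
    boundary = uuid.uuid4().hex
    image = Path(image_path).read_bytes()
    name = Path(image_path).name
    parts = [
        (f'--{boundary}\r\nContent-Disposition: form-data; '
         f'name="text"\r\n\r\n{text}\r\n').encode(),
        (f'--{boundary}\r\nContent-Disposition: form-data; name="image"; '
         f'filename="{name}"\r\n'
         f'Content-Type: application/octet-stream\r\n\r\n').encode()
        + image + b"\r\n",
        f"--{boundary}--\r\n".encode(),
    ]
    return b"".join(parts), boundary

def post_imgtxt(image_path: str, text: str) -> bytes:
    """POST an image plus a text prompt to the VLM server."""
    body, boundary = build_multipart(image_path, text)
    req = urllib.request.Request(
        SERVER_URL, data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read()

if __name__ == "__main__":
    # post_imgtxt("images/barefeet1.jpg", "Describe this image.")  # needs a running server
    pass
```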

ASR + VLM servers

  1. python whisper_llava_server.py
  2. python ../gTTS.py "這是什麼有名的台南美食?" zh (TTS; the prompt asks "What is this famous Tainan dish?")
  3. python post_imgau.py (client)


Text-to-Speech


Text-to-Image


Image-to-3D

TripoSR

Text-to-3D

gTranslate + SDXL-Lightning + TripoSR + AppInventor2
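The Text-to-3D pipeline above chains translation, image generation, and 3D reconstruction. A skeleton of that orchestration with stub stages — the function names and the stage wiring are placeholders for illustration, not the repo's actual APIs; in the real pipeline each stub would call gTranslate, SDXL-Lightning, and TripoSR respectively:

```python
from typing import Callable

def run_pipeline(prompt: str, stages: list[tuple[str, Callable]]) -> object:
    """Feed the prompt through each named stage in order, logging progress."""
    result = prompt
    for name, stage in stages:
        result = stage(result)
        print(f"{name}: done")
    return result

# Placeholder stages: each just tags its input so the data flow is visible.
def translate(text):       return f"en:{text}"      # would call gTranslate
def text_to_image(text):   return f"image({text})"  # would call SDXL-Lightning
def image_to_3d(img):      return f"mesh({img})"    # would call TripoSR

if __name__ == "__main__":
    mesh = run_pipeline("a shiba inu dog", [
        ("translate", translate),
        ("SDXL-Lightning", text_to_image),
        ("TripoSR", image_to_3d),
    ])
    print(mesh)  # mesh(image(en:a shiba inu dog))
```

The resulting mesh file would then be served to the AppInventor2 mobile app for display.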