AI 教材
AIGC 教材
GenAI-projects 教材
範例程式: git clone https://github.com/rkuo2000/GenAI
<img width="50%" height="50%" src="https://github.com/rkuo2000/GenAI/raw/main/assets/Tensor.Art_Flux_girl.png"
ComfyUI Now Supports Stable Diffusion 3.5!
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI
mv ~/Downloads/flux1-dev-fp8.safetensors ~/ComfyUI/models/unet/
mv ~/Downloads/t5xxl_fp8_e4m3fn.safetensors ~/ComfyUI/models/clip/
mv ~/Downloads/clip_l.safetensors ~/ComfyUI/models/clip/
mv ~/Downloads/ae.safetensors ~/ComfyUI/models/vae/
python main.py
open Browser at http:127.0.0.1:8188
drag flux_dev_fp8_example.png to browser window to generate the work-flow chart
CLIP Text Encode (Positive Prompt)
Queue Prompt
to generate imagepretty Asian woman was holding the flowers in her hands, Korean Model, real photo style, full body shot.
One girl, long hair, model, white background, white shirt, khaki Capri pants, khaki loafers, sitting on a stool, lazy pose, slightly tilting head, smiling, Asian beauty, loose-ting clothes, inting clothes , slightly raised foot, half-body shot, Canon R5 camera style, blurred background, indoor, natural light, some sunlight shining on the face,9 : 16.
A modern office building design with 6 floors. The design language of the building is organic volume, curve design elements, natural leave or flower symbols.
7z x webui_forge_cu124_torch24.7z
mv webui_forge_cu124_torch24 WebUI-Forge
./webui.sh
gTranslate + SDXL-Lightning + TripoSR + Blender
Kaggle: https://www.kaggle.com/code/rkuo2000/triposr
Code: https://github.com/apple/ml-depth-pro
Kaggle: https://www.kaggle.com/code/rkuo2000/depth-pro
SV4D
SV4D was trained to generate 40 frames (5 video frames x 8 camera views) at 576x576 resolution
<img width="50%" height="50%" src="https://github.com/rkuo2000/GenAI/raw/main/assets/ImagineArt_flying_cat_wearing_superman_suit.png"
ComfyUI-MuseTalk
<video src=https://github.com/TMElyralab/MuseTalk/assets/163980830/b2a879c2-e23a-4d39-911d-51f0343218e4 controls preload></video>
python gTTS.py "How are you" en
: generate gTTS.mp3python gT2T.py "How are you" fr
: deep-translatorpython gSpeak.py "How are you" fr
: deep-translator, gTTS & Mpg123python parler.py
python bark_en.py
, python bark_cn.py
python coqui_en.py
, python coqui_zh.py
python text_to_speech.py
python gTTS.py "你好?" zh
python gTranslate.py
Blog: 語音辨識API
python whisper_llm_server.py
python ../gTTS.py "Hello, how are you?" en
python post_audio.py
Large Language Models 教材
Prompt Engineering 教材
git clone https://github.com/rkuo2000/GenAI
cd GenAI/Text-to-Text
python gpt4free.py
(gpt-3.5-turbo)python gpt4all_prompting.py
python LLM_prompting.py
python llm_server.py
(on GPU)python post_text.py
(on PC)python ../gTTS.py "Hello, how are you?" en
python post_audio.py
ollama list
ollama run llama3.2
python ollama_chat.py
python ollama_stream.py
(print text in streaming mode)python ollama_curl.py
python ollama_speak.py
(ollama generated text, gTTS to speech, then mpg123 to speak)python ollama_speak_t2t.py
(ollama generated text, gTTS to speech, deep-translator to zh-TW, mpg123 to speak)MIT App Inventor 2 example for using Google Gemini
Download Gemini_Talk.aia , import to [ai2.mit](https://ai2.appinventor.mit.edu/)
Get API Key and put into the blank
Build apk, download & install to run on smartphone
(三星手機使用三星文字轉語音引擎應用程式, 語言設繁體中文會講不出話, 要改成簡體中文, 或使用英文)
fine-tune-gemma-7b-it-for-sentiment-analysis
fine-tune-llama-3-for-sentiment-analysis
fine-tune-gemma-models-in-keras-using-lora
For running server, (use one of the following)
python llava_server.py
python llava_next_server.py
python phi3-vision_server.py
For running client, (post image & text to VLM server)
python post_imgtxt.py images/barefeet1.jpg
python whisper_llava_server.py
python ../gTTS.py "這是什麼有名的台南美食?" zh
(TTS)python post_imgau.py
(client)python gemini_image.py
python gemini_jpg2csv.py
Kaggle: rkuo2000/swarm-llama3-groq
Colab: colab_Swarm_Llama3_Groq.ipynb