Technow: DrEureka, Nvidia Llama3-ChatQA, bootstrapped Llama-3 120B
Meet DrEureka, an LLM agent that trains robots in simulation, and Nvidia's Llama3-ChatQA-1.5, which excels at conversational question answering. Also, Maxime Labonne's Meta-Llama-3-120B-Instruct merges multiple instances of a 70B model to scale up its capabilities. These innovations are shaping the future of AI!
DrEureka: an LLM agent that writes code to train robot skills in simulation
DrEureka automates the code-writing needed to train robot skills in simulation and to bridge the simulation-to-reality gap. In one demonstration, it enables a robot dog to balance and walk on a yoga ball in simulation and then transfer those skills to the real world without fine-tuning. The agent uses GPT-4's physical intuition to tune domain-randomization parameters such as friction and gravity, surpassing traditional domain randomization methods.
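To make the domain-randomization idea concrete, here is a minimal sketch: an LLM proposes plausible physics ranges, and each training episode samples a configuration from them. The `ask_llm_for_ranges` stub, the parameter names, and the simulator hook are all hypothetical stand-ins, not DrEureka's actual code (its real pipeline also writes reward functions).

```python
import json
import random

def ask_llm_for_ranges(task_description: str) -> dict:
    """Ask an LLM for plausible physics randomization ranges.
    Hypothetical stub: a real implementation would call GPT-4 here."""
    prompt = (
        f"Task: {task_description}\n"
        "Using physical intuition, propose randomization ranges for "
        'friction, gravity_z, and restitution as JSON: {"friction": [lo, hi], ...}'
    )
    # A real implementation would send `prompt` to GPT-4; we return a
    # canned response so the sketch runs standalone.
    response = '{"friction": [0.3, 1.2], "gravity_z": [-10.5, -9.0], "restitution": [0.0, 0.5]}'
    return json.loads(response)

def sample_episode_params(ranges: dict) -> dict:
    """Draw one concrete physics configuration per training episode."""
    return {name: random.uniform(lo, hi) for name, (lo, hi) in ranges.items()}

ranges = ask_llm_for_ranges("quadruped balancing and walking on a yoga ball")
for episode in range(3):
    params = sample_episode_params(ranges)
    print(f"episode {episode}: {params}")
    # simulator.reset(physics=params)  # hypothetical simulator hook
```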
Nvidia Llama3-ChatQA
Llama3-ChatQA-1.5 excels at conversational question answering (QA) and retrieval-augmented generation (RAG). It was developed with an improved training recipe from the ChatQA paper and is built on top of the Llama-3 base model; Nvidia incorporated additional conversational QA data to enhance its tabular and arithmetic-calculation capabilities. Llama3-ChatQA-1.5 comes in two variants, Llama3-ChatQA-1.5-8B and Llama3-ChatQA-1.5-70B. Both models were originally trained with Megatron-LM, and the checkpoints were converted to Hugging Face format. For more information about ChatQA, check the website!
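As a quick illustration, here is a minimal RAG-style sketch using the nvidia/Llama3-ChatQA-1.5-8B checkpoint with plain transformers: retrieved context is prepended, then the dialogue turn follows. The system message and prompt layout below only approximate the format documented on the model card, so verify it there before relying on this.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/Llama3-ChatQA-1.5-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# RAG-style input: retrieved context first, then the user question.
context = "NVIDIA was founded in 1993 and is headquartered in Santa Clara."
question = "When was NVIDIA founded?"
prompt = (
    "System: This is a chat between a user and an AI assistant. "
    "The assistant gives helpful answers based on the context.\n\n"
    f"{context}\n\nUser: {question}\n\nAssistant:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```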
Bootstrapped Llama-3 120B
Maxime Labonne created Meta-Llama-3-120B-Instruct by merging multiple instances of Meta-Llama-3-70B-Instruct with MergeKit.
This method, called a "self-merge," scales the model from 70 billion to 120 billion parameters. Labonne structured the merge around overlapping layer ranges spanning layers 0 to 80, enhancing the model's capabilities.
He employed a "passthrough" merging technique, as sketched below, keeping the data type at float16 to optimize performance.
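In MergeKit, a passthrough self-merge like this is expressed as a YAML config of overlapping layer slices drawn from the same model. The slice ranges below are an assumption consistent with the 0-80 overlap described above, not necessarily Labonne's exact configuration; the model card has the real one.

```yaml
# Illustrative passthrough self-merge (slice ranges are an assumption;
# see the Meta-Llama-3-120B-Instruct model card for the exact config).
slices:
  - sources:
      - model: meta-llama/Meta-Llama-3-70B-Instruct
        layer_range: [0, 20]
  - sources:
      - model: meta-llama/Meta-Llama-3-70B-Instruct
        layer_range: [10, 30]
  - sources:
      - model: meta-llama/Meta-Llama-3-70B-Instruct
        layer_range: [20, 40]
  # ...further overlapping 20-layer slices, stepping by 10, up to [60, 80]
merge_method: passthrough
dtype: float16
```

Running `mergekit-yaml config.yaml ./merged-model` then stitches the duplicated layer slices into a single, larger checkpoint.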
Performance:
Llama 3 120B takes 6th place on the Creative Writing benchmark
It outperforms Llama 3 70B, though quite inefficiently (+50B parameters for a mere +1.5 points)
Despite its prowess in creative writing, the model underperforms on reasoning tasks compared to other models like GPT-4
Application:
Use this model for creative writing. It uses the Llama 3 chat template with a default context window of 8K.
```python
!pip install -qU transformers accelerate

import torch
import transformers
from transformers import AutoTokenizer

model = "mlabonne/Meta-Llama-3-120B-Instruct"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the Llama 3 chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Load the model in float16 and shard it across available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample a completion and print it.
outputs = pipeline(
    prompt, max_new_tokens=256, do_sample=True,
    temperature=0.7, top_k=50, top_p=0.95
)
print(outputs[0]["generated_text"])
```