Launch a Fine-Tuning Job

This guide walks you through fine-tuning a large language model with MonsterTuner, a no-code, scalable LLM fine-tuner.

Access the Fine-Tuning Portal

After logging into your account, open the "Fine-Tuning" portal from the left-side menu.

Create a New Fine-Tuning Job

Click on the "Create New Job" button located within the portal.

First Step: Specify Job Name and Select an LLM

Specify a unique job name, then select the LLM that you wish to fine-tune from the drop-down menu.

You can choose from the latest LLMs such as Llama 7B, GPT-J 6B or StableLM 7B.

Second Step: Select or Create a Task and Dataset

Select a Fine-Tuning Task:

Select a task for fine-tuning the LLM, such as "Instruction Fine-Tuning" or "Text Classification". If your task is not listed, select the "Other" option, specify your custom task, and update the prompt configuration below it to match that task.

Select a Dataset

When selecting a dataset for fine-tuning, you have three options:

Option 1 - Select a curated Hugging Face Dataset:

Choose from our curated selection of commonly used Hugging Face datasets, each with a predefined training prompt configuration. If the chosen dataset has subsets, you can select one from the 'choose a Subset' dropdown, which becomes visible after you choose a dataset.

Option 2 - Specify an unlisted Hugging Face Dataset:

If you cannot find the Hugging Face dataset you want in our curated list, choose the 'other' option and provide the name of the Hugging Face dataset to fetch and use.

If the Hugging Face dataset has subsets, you can choose one from the `choose a Subset` dropdown. If the dataset has no subsets, the default subset is selected automatically.

Unlike pre-curated datasets, an unlisted dataset requires you to specify the Prompt Configuration yourself: in the text area below, replace the placeholder text in square brackets with the matching column names from the dataset.

Example: Suppose the kz919/alpaca dataset is fetched through the 'other' option.

We replace the suggested prompt:

###Instruction:[replace with Instruction Column Name]

###Response:[replace with Response Column Name]

as follows:

###Instruction:[prompt]

###Response:[completion]
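To make the role of the bracketed column names concrete, here is a minimal Python sketch of how such a prompt configuration could be filled in from one dataset row. It is an illustration only: the helper `render_prompt` and the sample row are hypothetical, and this is not a description of how MonsterTuner processes the configuration internally.

```python
import re

# Hypothetical helper for illustration; not MonsterAPI's actual implementation.
# Matches any text wrapped in square brackets, e.g. [prompt].
PLACEHOLDER = re.compile(r"\[([^\[\]]+)\]")


def render_prompt(template, row):
    """Replace each [column] placeholder with that column's value from the row."""
    def substitute(match):
        column = match.group(1)
        if column not in row:
            # An unreplaced placeholder or a misspelled column name ends up here.
            raise KeyError(f"Column {column!r} not found in dataset row")
        return str(row[column])

    return PLACEHOLDER.sub(substitute, template)


# Prompt configuration from the example above, with a made-up sample row.
template = "###Instruction:[prompt]\n\n###Response:[completion]"
row = {"prompt": "Name a primary color.", "completion": "Red."}

rendered = render_prompt(template, row)
print(rendered)
```

In this sketch, a placeholder that does not match any column raises an error, which shows why the bracketed names need to match the dataset's column names exactly.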

Option 3 - Select your Managed Dataset:

You can also use your own datasets for fine-tuning.

Refer to our documentation on Managed Datasets for details on how to upload a custom dataset.

If you have already uploaded a dataset to the MonsterAPI platform via the "Managed Datasets" page, you can choose it from the "My Datasets" dropdown. This dropdown appears only if you have already uploaded at least one dataset.

Unlike pre-curated datasets, your own dataset requires you to specify the Prompt Configuration yourself: in the text area below, replace the placeholder text in square brackets with the matching column names from your dataset.

Example:

Let us say we uploaded a dataset named novel_plum_alpaca, a CSV file with two columns named "prompt" and "completion".

When we select this dataset from the My Datasets dropdown, a text area named "Prompt Configuration" is displayed. We need to specify our target column names in that configuration so that the fine-tuner can use our dataset properly.

We replace the suggested prompt configuration:

###Instruction:[replace with Instruction Column Name]

###Response:[replace with Response Column Name]

with this:

###Instruction:[prompt]

###Response:[completion]

since 'prompt' is the instruction column and 'completion' is the response column in our dataset.
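For reference, a two-column CSV like the one in this example could be produced with Python's standard csv module. This is a rough sketch: the rows are made up, the helper `write_dataset` is hypothetical, and any further formatting requirements for uploads are described in the Managed Datasets documentation rather than here.

```python
import csv
import io

# Made-up rows for illustration; replace them with your own data.
rows = [
    {"prompt": "Translate 'hello' to French.", "completion": "bonjour"},
    {"prompt": "What is 2 + 2?", "completion": "4"},
]


def write_dataset(file_obj, rows):
    """Write the rows as CSV with a header of exactly 'prompt' and 'completion'."""
    writer = csv.DictWriter(file_obj, fieldnames=["prompt", "completion"])
    writer.writeheader()
    writer.writerows(rows)


# Write to an in-memory buffer here. To create a file instead, use:
#   with open("novel_plum_alpaca.csv", "w", newline="", encoding="utf-8") as f:
#       write_dataset(f, rows)
buffer = io.StringIO()
write_dataset(buffer, rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The header names written here are the ones that go inside the square brackets of the Prompt Configuration.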

That's it for the dataset preparation step. We can now move on to the next step!

Third Step: Training Hyperparameter Configuration

In this step, set your hyperparameters, such as epochs, learning rate, cutoff length, and warmup steps. These parameters are filled in automatically based on your chosen model, but you can modify them according to your needs.

You can also provide Hugging Face and WandB credentials to upload the model to Hugging Face and record training logs in WandB.

Please note: These parameters affect the fine-tuning process, and incorrect values can cause the job to fail.
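Before changing the pre-filled values, it can help to sanity-check them. The sketch below shows one way to do that in Python. The key names, the example values, and the validity rules are all assumptions chosen for illustration; they are generic checks and do not reflect the limits that the platform itself enforces.

```python
def check_hyperparameters(config):
    """Return a list of human-readable problems; an empty list means none found.

    The rules below are generic sanity checks (assumptions), not platform limits.
    """
    problems = []

    epochs = config.get("epochs")
    if not isinstance(epochs, int) or epochs < 1:
        problems.append("epochs should be a positive integer")

    learning_rate = config.get("learning_rate")
    if not isinstance(learning_rate, (int, float)) or not 0 < learning_rate < 1:
        problems.append("learning_rate should be a number between 0 and 1")

    cutoff_length = config.get("cutoff_length")
    if not isinstance(cutoff_length, int) or cutoff_length < 1:
        problems.append("cutoff_length should be a positive integer")

    warmup_steps = config.get("warmup_steps")
    if not isinstance(warmup_steps, int) or warmup_steps < 0:
        problems.append("warmup_steps should be a non-negative integer")

    return problems


# Hypothetical values for illustration only.
example = {"epochs": 3, "learning_rate": 2e-4, "cutoff_length": 512, "warmup_steps": 50}
problems = check_hyperparameters(example)
print(problems or "no obvious problems found")
```

A check like this only catches obviously invalid values; whether a given combination trains well still depends on the model and dataset.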

Final Step: Review and Submit Job

Click on Next to proceed to the summary page. Review the final job summary to ensure all the settings are correct, then submit your request.

Optional Settings:

1. Track your Fine-Tuning job using WandB (Optional)

To track your fine-tuning run, you may add your WandB credentials in the Third Step, before job submission:

  • Your username,

  • Your WandB key, and

  • Your project name

If you add these credentials, the job will automatically send metrics to your WandB project so you can track the experiments.

2. Upload model outputs to a Hugging Face Repo (Optional)

If you want to store the final fine-tuned model weights in a Hugging Face repository, add your Hugging Face credentials in the Third Step, before job submission:

  • Your Hugging Face API Key (must have write access)

  • Your Hugging Face Repo Path

If you add these credentials, the job will automatically publish the fine-tuned weights to your Hugging Face repo upon completion.

That's it! Fin!
