Utils
promptolution.utils
Module for utility functions and classes.
callbacks
Callback classes for logging, saving, and tracking optimization progress.
BaseCallback
Bases: ABC
Base class for optimization callbacks.
Callbacks can be used to monitor the optimization process, save checkpoints, log metrics, or implement early stopping criteria.
Source code in promptolution/utils/callbacks.py
__init__(**kwargs)
Initialize the callback with a configuration.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `config` | | Configuration for the callback. | required |
| `**kwargs` | | Additional keyword arguments. | `{}` |
on_epoch_end(optimizer)
Called at the end of each optimization epoch.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `optimizer` | | The optimizer object that called the callback. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| Bool | | True if the optimization should continue, False if it should stop. |
Source code in promptolution/utils/callbacks.py
on_step_end(optimizer)
Called at the end of each optimization step.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `optimizer` | | The optimizer object that called the callback. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| Bool | | True if the optimization should continue, False if it should stop. |
Source code in promptolution/utils/callbacks.py
on_train_end(optimizer)
Called at the end of the entire optimization process.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `optimizer` | | The optimizer object that called the callback. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| Bool | | True if the optimization should continue, False if it should stop. |
Source code in promptolution/utils/callbacks.py
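A minimal sketch of a custom callback, assuming `BaseCallback` is importable from `promptolution.utils.callbacks`; the `optimizer.scores` attribute used below is a hypothetical stand-in for whatever state your optimizer exposes:

```python
from promptolution.utils.callbacks import BaseCallback


class ScoreThresholdCallback(BaseCallback):
    """Stop optimization once any prompt reaches a target score (illustrative only)."""

    def __init__(self, threshold, **kwargs):
        super().__init__(**kwargs)
        self.threshold = threshold

    def on_step_end(self, optimizer):
        # `optimizer.scores` is a hypothetical attribute used for illustration.
        # Returning False tells the optimizer to stop; True lets it continue.
        scores = getattr(optimizer, "scores", [])
        return not any(score >= self.threshold for score in scores)
```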
BestPromptCallback
Bases: BaseCallback
Callback for tracking the best prompt during optimization.
This callback keeps track of the prompt with the highest score.
Attributes:

| Name | Type | Description |
|---|---|---|
| `best_prompt` | `str` | The prompt with the highest score so far. |
| `best_score` | `float` | The highest score achieved so far. |
Source code in promptolution/utils/callbacks.py
__init__()
Initialize the BestPromptCallback.
get_best_prompt()
Get the best prompt and score achieved during optimization.
Returns: Tuple[str, float]: The best prompt and score.
on_step_end(optimizer)
Update the best prompt and score if a new high score is achieved.
Args: optimizer: The optimizer object that called the callback.
Source code in promptolution/utils/callbacks.py
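A short usage sketch; passing the callback to an optimizer via a `callbacks=` argument is an assumption about the optimizer interface, not part of this module:

```python
from promptolution.utils.callbacks import BestPromptCallback

best_prompt_callback = BestPromptCallback()

# optimizer = SomeOptimizer(..., callbacks=[best_prompt_callback])  # hypothetical optimizer setup
# optimizer.optimize(...)

# After the run, retrieve the best prompt and its score.
best_prompt, best_score = best_prompt_callback.get_best_prompt()
```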
FileOutputCallback
Bases: BaseCallback
Callback for saving optimization progress to a specified file type.
This callback saves information about each step to a file.
Attributes:

| Name | Type | Description |
|---|---|---|
| `dir` | `str` | Directory the file is saved to. |
| `step` | `int` | The current step number. |
| `file_type` | `str` | The type of file to save the output to. |
Source code in promptolution/utils/callbacks.py
__init__(dir, file_type='parquet')
Initialize the FileOutputCallback.
Args: dir (str): Directory the output file is saved to. file_type (str): The type of file to save the output to.
Source code in promptolution/utils/callbacks.py
on_step_end(optimizer)
Save prompts and scores to the output file.
Args: optimizer: The optimizer object that called the callback.
Source code in promptolution/utils/callbacks.py
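A minimal instantiation sketch using the documented arguments; the target directory is illustrative:

```python
from promptolution.utils.callbacks import FileOutputCallback

# Write per-step prompts and scores to ./results as Parquet (the default file type).
file_callback = FileOutputCallback(dir="./results", file_type="parquet")
```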
LoggerCallback
Bases: BaseCallback
Callback for logging optimization progress.
This callback logs information about each step, epoch, and the end of training.
Attributes:

| Name | Type | Description |
|---|---|---|
| `logger` | | The logger object to use for logging. |
| `step` | `int` | The current step number. |
Source code in promptolution/utils/callbacks.py
__init__(logger)
Initialize the LoggerCallback.
on_step_end(optimizer)
Log information about the current step.
Source code in promptolution/utils/callbacks.py
on_train_end(optimizer, logs=None)
Log information at the end of training.
Args: optimizer: The optimizer object that called the callback. logs: Additional information to log.
Source code in promptolution/utils/callbacks.py
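A small sketch that reuses the library's logging helper (see the `logging` section below) to build the logger handed to the callback:

```python
from promptolution.utils.callbacks import LoggerCallback
from promptolution.utils.logging import get_logger

# The callback reports step, epoch, and end-of-training information via this logger.
logger = get_logger(__name__)
logger_callback = LoggerCallback(logger)
```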
ProgressBarCallback
Bases: BaseCallback
Callback for displaying a progress bar during optimization.
This callback uses tqdm to display a progress bar that updates at each step.
Attributes:

| Name | Type | Description |
|---|---|---|
| `pbar` | `tqdm` | The tqdm progress bar object. |
Source code in promptolution/utils/callbacks.py
__init__(total_steps)
Initialize the ProgressBarCallback.
Args: total_steps (int): The total number of steps in the optimization process.
on_step_end(optimizer)
Update the progress bar at the end of each step.
Args: optimizer: The optimizer object that called the callback.
on_train_end(optimizer)
Close the progress bar at the end of training.
Args: optimizer: The optimizer object that called the callback.
TokenCountCallback
Bases: BaseCallback
Callback for stopping optimization based on the total token count.
Source code in promptolution/utils/callbacks.py
__init__(max_tokens_for_termination, token_type_for_termination)
Initialize the TokenCountCallback.
Args: max_tokens_for_termination (int): Maximum number of tokens allowed before the algorithm is stopped. token_type_for_termination (str): Can be one of "input_tokens", "output_tokens", or "total_tokens".
Source code in promptolution/utils/callbacks.py
on_step_end(optimizer)
Check if the total token count exceeds the maximum allowed. If so, stop the optimization.
Source code in promptolution/utils/callbacks.py
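A sketch combining both callbacks with the documented arguments; the step and token budgets are illustrative:

```python
from promptolution.utils.callbacks import ProgressBarCallback, TokenCountCallback

callbacks = [
    # Render a tqdm progress bar over the planned number of optimization steps.
    ProgressBarCallback(total_steps=10),
    # Stop the run once more than 100,000 input tokens have been consumed.
    TokenCountCallback(
        max_tokens_for_termination=100_000,
        token_type_for_termination="input_tokens",
    ),
]
```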
config
Configuration class for the promptolution library.
ExperimentConfig
Configuration class for the promptolution library.
This is a unified configuration class that handles all experiment settings. It provides validation and tracking of used fields.
Source code in promptolution/utils/config.py
__getattribute__(name)
Override attribute access to track used attributes.
Source code in promptolution/utils/config.py
__init__(**kwargs)
Initialize the configuration with the provided keyword arguments.
__setattr__(name, value)
Override attribute setting to track used attributes.
Source code in promptolution/utils/config.py
apply_to(obj)
Apply matching attributes from this config to an existing object.
Examines each attribute of the target object and updates it if a matching attribute exists in the config.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `obj` | | The object to update with config values. | required |

Returns:

| Type | Description |
|---|---|
| | The updated object. |
Source code in promptolution/utils/config.py
validate()
Check if any attributes were not used and run validation.
Does not raise an error, but logs a warning if any attributes are unused or validation fails.
Source code in promptolution/utils/config.py
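A usage sketch; the field names passed to the constructor are illustrative, since the config accepts arbitrary keyword arguments:

```python
from promptolution.utils.config import ExperimentConfig

# Collect experiment settings in one place (field names here are illustrative).
config = ExperimentConfig(n_steps=10, task_description="Classify sentiment as positive or negative.")

# Copy matching attributes onto an existing object, e.g. an optimizer or task.
optimizer = ...  # placeholder: any object whose attributes overlap with the config fields
optimizer = config.apply_to(optimizer)

# Log a warning (without raising) if any config fields were never used.
config.validate()
```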
logging
Logging configuration for the promptolution library.
get_logger(name, level=None)
Get a logger with the specified name and level.
This function provides a standardized way to get loggers throughout the library, ensuring consistent formatting and behavior.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Name of the logger, typically the name of the module. | required |
| `level` | `int` | Logging level. Defaults to None, which uses the root logger's level. | `None` |

Returns:

| Type | Description |
|---|---|
| `Logger` | Configured logger instance. |
Source code in promptolution/utils/logging.py
setup_logging(level=logging.INFO)
Set up logging for the promptolution library.
This function configures the root logger for the library with appropriate formatting and level.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `level` | `int` | Logging level. Defaults to logging.INFO. | `logging.INFO` |
Source code in promptolution/utils/logging.py
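A minimal setup sketch using the documented signatures:

```python
import logging

from promptolution.utils.logging import get_logger, setup_logging

# Configure the library's root logger once, at program start.
setup_logging(level=logging.DEBUG)

# Module-level loggers then share consistent formatting and level.
logger = get_logger(__name__)
logger.info("Starting prompt optimization run.")
```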
prompt_creation
Utility functions for prompt creation.
create_prompt_variation(prompt, llm, meta_prompt=None)
Generate a variation of the given prompt(s) while keeping the semantic meaning.
Idea taken from the paper by Zhou et al. (2022), https://arxiv.org/pdf/2211.01910.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `prompt` | `Union[List[str], str]` | The prompt(s) to generate variations of. | required |
| `llm` | `BaseLLM` | The language model to use for generating the variations. | required |
| `meta_prompt` | `str` | The meta prompt to use for generating the variations. | `None` |

Returns:

| Type | Description |
|---|---|
| `List[str]` | A list of generated variations of the input prompt(s). |
Source code in promptolution/utils/prompt_creation.py
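A usage sketch; `llm` stands in for an already constructed promptolution LLM instance (a `BaseLLM` subclass):

```python
from promptolution.utils.prompt_creation import create_prompt_variation

llm = ...  # placeholder: any promptolution LLM instance (BaseLLM subclass)

# Rephrase an instruction while keeping its meaning; a list of prompts may be
# passed instead of a single string.
variations = create_prompt_variation(
    "Classify the sentiment of the following text as positive or negative.",
    llm=llm,
)
```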
create_prompts_from_samples(task, llm, meta_prompt=None, n_samples=3, task_description=None, n_prompts=1, get_uniform_labels=False)
Generate a set of prompts from dataset examples sampled from a given task.
Idea taken from the paper by Zhou et al. (2022), https://arxiv.org/pdf/2211.01910. Samples are selected such that (1) all possible classes are represented and (2) the samples are as representative as possible.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `task` | `BaseTask` | The task to generate prompts for. | required |
| `llm` | `BaseLLM` | The language model to use for generating the prompts. | required |
| `meta_prompt` | `str` | The meta prompt to use for generating the prompts. | `None` |
| `n_samples` | `int` | The number of samples to use for generating prompts. | `3` |
| `task_description` | `str` | The description of the task to include in the prompt. | `None` |
| `n_prompts` | `int` | The number of prompts to generate. | `1` |
| `get_uniform_labels` | `bool` | If True, samples are selected such that all classes are represented. | `False` |

Returns:

| Type | Description |
|---|---|
| `List[str]` | A list of generated prompts. |
Source code in promptolution/utils/prompt_creation.py
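A usage sketch with illustrative arguments; `task` and `llm` stand in for already constructed `BaseTask` and `BaseLLM` instances:

```python
from promptolution.utils.prompt_creation import create_prompts_from_samples

task = ...  # placeholder: a promptolution task (BaseTask subclass)
llm = ...   # placeholder: a promptolution LLM (BaseLLM subclass)

# Induce five candidate instructions from three sampled examples, forcing the
# sample to cover every class of the task.
prompts = create_prompts_from_samples(
    task,
    llm,
    n_samples=3,
    task_description="Classify news headlines by topic.",
    n_prompts=5,
    get_uniform_labels=True,
)
```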
test_statistics
Implementation of statistical significance tests used in the racing algorithm. Contains paired t-test functionality to compare prompt performance and determine statistical significance between candidates.
get_test_statistic_func(name)
Get the test statistic function based on the name provided.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Name of the test statistic function. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| callable | `callable` | The corresponding test statistic function. |
Source code in promptolution/utils/test_statistics.py
paired_t_test(scores_a, scores_b, alpha=0.05)
Uses a paired t-test to check whether candidate A's accuracy is significantly higher than candidate B's accuracy at a confidence level of 1 - alpha.
Assumptions:
- The samples are paired.
- The differences between the pairs are approximately normally distributed (e.g., n > 30).
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `scores_a` | `ndarray` | Array of accuracy scores for candidate A. | required |
| `scores_b` | `ndarray` | Array of accuracy scores for candidate B. | required |
| `alpha` | `float` | Significance level (default 0.05 for 95% confidence). | `0.05` |

Returns:

| Name | Type | Description |
|---|---|---|
| bool | `bool` | True if candidate A is significantly better than candidate B, False otherwise. |
Source code in promptolution/utils/test_statistics.py
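A usage sketch with toy per-example accuracy scores; the registry name passed to `get_test_statistic_func` is an assumption, not a documented value:

```python
import numpy as np

from promptolution.utils.test_statistics import get_test_statistic_func, paired_t_test

# Per-example accuracy scores for two prompt candidates, evaluated on the same data points.
scores_a = np.array([1, 1, 0, 1, 1, 0, 1, 1, 1, 1], dtype=float)
scores_b = np.array([1, 0, 0, 1, 0, 0, 1, 1, 0, 1], dtype=float)

# True if candidate A is significantly better than candidate B at the 95% level.
a_is_better = paired_t_test(scores_a, scores_b, alpha=0.05)

# Look up the same test by name (the string "paired_t_test" is assumed here).
test_func = get_test_statistic_func("paired_t_test")
```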
token_counter
Token counter for LLMs.
This module provides a function to count the number of tokens in a given text.
get_token_counter(llm)
Get a token counter function for the given LLM.
This function returns a callable that counts tokens based on the LLM's tokenizer or a simple split method if no tokenizer is available.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `llm` | | The language model object that may have a tokenizer. | required |

Returns:

| Type | Description |
|---|---|
| | A callable that takes a text input and returns the token count. |
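A usage sketch; `llm` stands in for any promptolution LLM instance, and the fallback behavior follows the description above:

```python
from promptolution.utils.token_counter import get_token_counter

llm = ...  # placeholder: any promptolution LLM instance

# Build a counter that uses the LLM's tokenizer if present, otherwise a simple split.
count_tokens = get_token_counter(llm)

n_tokens = count_tokens("The quick brown fox jumps over the lazy dog.")
```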