alpenglow package
Subpackages
- alpenglow.evaluation package
- alpenglow.experiments package
  - Submodules
  - alpenglow.experiments.AsymmetricFactorExperiment module
  - alpenglow.experiments.BatchAndOnlineFactorExperiment module
  - alpenglow.experiments.BatchFactorExperiment module
  - alpenglow.experiments.FactorExperiment module
  - alpenglow.experiments.NearestNeighborExperiment module
  - alpenglow.experiments.PersonalPopularityExperiment module
  - alpenglow.experiments.PopularityExperiment module
  - alpenglow.experiments.PopularityTimeframeExperiment module
  - alpenglow.experiments.SvdppExperiment module
  - alpenglow.experiments.TransitionProbabilityExperiment module
  - Module contents
- alpenglow.offline package
- alpenglow.utils package
Submodules
alpenglow.Getter module
class alpenglow.Getter.Getter
Bases: object
Responsible for creating and managing cpp objects in the alpenglow.cpp package.
- collect_ = False
- items = []
class alpenglow.Getter.MetaGetter(a, b, c)
Bases: type
Metaclass of alpenglow.Getter.Getter. Provides utilities for creating and managing cpp objects in the alpenglow.cpp package. For more information, see Memory management.
alpenglow.OnlineExperiment module
class alpenglow.OnlineExperiment.OnlineExperiment(seed=254938879, top_k=100)
Bases: alpenglow.ParameterDefaults.ParameterDefaults
This is the base class of every online experiment in Alpenglow. It builds the general experimental setup needed to run the online training and evaluation of a model. It also handles default parameters and the ability to override them when instantiating an experiment.
Subclasses should implement the config() method; see that method’s documentation for details.
Online evaluation in Alpenglow processes the data row by row and evaluates the model on each new record before providing the model with the new information.
Evaluation is done by ranking the next item on the user’s toplist and saving the rank. If the item is not found among the top top_k items, the evaluation step returns NaN.
For a brief tutorial on using this class, see Five minute tutorial.
Parameters:
- seed (int) – The seed used to initialize the RNGs. Should not be 0.
- top_k (int) – The length of the toplists.
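The evaluate-then-train order described above can be illustrated with a minimal, self-contained sketch. It does not use Alpenglow itself: the function name online_evaluate, the toy popularity model (which ignores the user), and the 1-based rank are all assumptions made for illustration only.

```python
import math
from collections import Counter


def online_evaluate(records, top_k):
    """Rank each record's item before the toy model is updated with it."""
    popularity = Counter()  # toy model: score = number of past occurrences
    ranks = []
    for user, item in records:
        # 1. Evaluate: build the toplist from what the model has seen so far.
        ordered = sorted(popularity.items(), key=lambda kv: (-kv[1], kv[0]))
        toplist = [candidate for candidate, _ in ordered[:top_k]]
        # Rank is 1-based here (an assumption); NaN if the item is not in the top top_k.
        rank = toplist.index(item) + 1 if item in toplist else math.nan
        ranks.append(rank)
        # 2. Train: only now is the record revealed to the model.
        popularity[item] += 1
    return ranks


records = [("u1", "a"), ("u2", "a"), ("u1", "b"),
           ("u3", "a"), ("u2", "b"), ("u3", "c")]
ranks = online_evaluate(records, top_k=2)
```

Because each record is ranked before the model is updated, the first occurrence of any item in this toy setup always yields NaN.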
get_predictions()
If the calculate_toplists parameter is set when calling run, this method can be used to acquire the generated toplists.
Returns: DataFrame containing the columns record_id, time, user, item, rank and prediction.
- record_id is the index of the record being evaluated in the input DataFrame. Generally, there are top_k rows with the same record_id.
- time is the time of the evaluation
- user is the user the toplist is generated for
- item is the item at position rank in the toplist
- prediction is the prediction given by the model for the (user, item) pair at the time of evaluation.
Return type: pandas.DataFrame
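To make the row layout concrete, the sketch below builds the rows for a single evaluated record using plain dictionaries instead of a DataFrame. The helper toplist_rows, the sample scores, and the 1-based rank are hypothetical and only mirror the column names listed above; they are not part of Alpenglow.

```python
def toplist_rows(record_id, time, user, scores, top_k):
    """Return one row per toplist position for a single evaluated record."""
    # Highest prediction first; ties broken by item name for determinism.
    best = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return [
        {"record_id": record_id, "time": time, "user": user,
         "item": item, "rank": rank, "prediction": score}
        for rank, (item, score) in enumerate(best, start=1)  # 1-based rank is an assumption
    ]


rows = toplist_rows(record_id=7, time=1000, user=42,
                    scores={"a": 0.9, "b": 0.4, "c": 0.7}, top_k=2)
```

All rows produced for one record share the same record_id, which matches the note that there are generally top_k rows per record.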
run(data, experimentType=None, columns={}, verbose=True, out_file=None, lookback=False, initialize_all=False, max_item=-1, max_user=-1, calculate_toplists=False)
Parameters:
- data (pandas.DataFrame or str) – The input data, see Five minute tutorial. If this parameter is a string, the file it refers to has to be in the format specified by experimentType.
- experimentType (str) – The format of the input file if data is a string.
- columns (dict) – Optional mapping from the input DataFrame’s column names to the expected ones.
- verbose (bool) – Whether to write information about the experiment while running.
- out_file (str) – If set, the results of the experiment are also written to the file located at out_file.
- lookback (bool) – If set to True, a user’s previously seen items are excluded from the toplist evaluation. The eval columns of the input data should be set accordingly.
- calculate_toplists (bool or list) – Whether to actually compute the toplists or just the ranks (the latter is faster). It can be specified on a record-by-record basis by passing a list of booleans. The calculated toplists can be acquired after the experiment ends by using get_predictions.
Returns: Description of return value
Return type: bool
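The calculate_toplists parameter accepts either a single boolean or a list of booleans, one per record. One way such an argument could be normalized into per-record flags is sketched below. The helper per_record_flags and its length check are hypothetical and are not taken from Alpenglow's source.

```python
def per_record_flags(calculate_toplists, n_records):
    """Expand a bool-or-list argument into one boolean flag per record."""
    if isinstance(calculate_toplists, bool):
        # A single boolean applies to every record.
        return [calculate_toplists] * n_records
    flags = [bool(flag) for flag in calculate_toplists]
    if len(flags) != n_records:
        raise ValueError("expected one flag per record")
    return flags


all_off = per_record_flags(False, 3)
mixed = per_record_flags([True, False, True], 3)
```

Passing a list lets toplists be computed only for the records of interest, which keeps the faster rank-only evaluation for the rest.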