predict
Prediction module, called after model training.
custom_predict
custom_predict(X: Series, model: Pipeline, args: Namespace, y_true: Optional[ndarray] = None) -> tuple[ndarray, Namespace]
If the model has a predict_proba attribute, predict the probability of each label occurring. Furthermore, if the true labels are given, use them to tune the threshold for each class using train.tune_threshold. Otherwise, if the model has no predict_proba attribute, predict the label directly (0 or 1) using a 0.5 threshold.
Parameters:
- X (Series) – Preprocessed posts.
- model (Pipeline) – End-to-end pipeline including vectorizer and model.
- args (Namespace) – Arguments containing booleans for preprocessing the posts and hyperparameters for the modeling pipeline. Can also contain the best threshold tuned for each class.
- y_true (Optional[ndarray], default: None) – Ground truth (correct) target values.
Returns:
- y_pred (ndarray) – Estimated targets as returned by the model.
- args (Namespace) – Arguments, either the same as the input arguments or additionally containing the best threshold tuned for each class.
Source code in tagolym/predict.py
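The branching described for custom_predict can be sketched roughly as follows. This is a minimal illustration, not the code in tagolym/predict.py: the function name sketch_predict, the args.threshold attribute, the tune callable (standing in for train.tune_threshold), the >= comparison, the fallback to 0.5 when no tuned threshold is stored, and the two toy models are all assumptions made for the example.

```python
from argparse import Namespace
from typing import Callable, Optional

import numpy as np


def sketch_predict(X, model, args: Namespace,
                   y_true: Optional[np.ndarray] = None,
                   tune: Optional[Callable] = None):
    """Illustrative stand-in for the documented branching (not the tagolym source)."""
    if hasattr(model, "predict_proba"):
        y_prob = np.asarray(model.predict_proba(X))  # shape (n_samples, n_classes)
        if y_true is not None and tune is not None:
            # Assumption: the tuner returns one threshold per class.
            args.threshold = np.asarray(tune(y_true, y_prob))
        # Assumption: use 0.5 for every class when no tuned threshold is stored.
        threshold = getattr(args, "threshold", None)
        if threshold is None:
            threshold = np.full(y_prob.shape[1], 0.5)
        y_pred = (y_prob >= threshold).astype(int)
    else:
        # No probabilities available: take the model's 0/1 labels directly.
        y_pred = np.asarray(model.predict(X))
    return y_pred, args


class ProbModel:
    """Toy model exposing predict_proba for two posts and two classes."""
    def predict_proba(self, X):
        return np.array([[0.9, 0.2], [0.4, 0.7]])


class LabelModel:
    """Toy model that only exposes predict."""
    def predict(self, X):
        return np.array([[1, 0], [0, 1]])


posts = ["post a", "post b"]
y_true = np.array([[1, 0], [0, 1]])

# Probabilities with no tuned threshold: every class is cut at 0.5.
y_default, _ = sketch_predict(posts, ProbModel(), Namespace())

# Probabilities plus true labels: a (hypothetical) tuner sets per-class thresholds.
y_tuned, args_tuned = sketch_predict(
    posts, ProbModel(), Namespace(), y_true=y_true,
    tune=lambda yt, yp: [0.95, 0.1],
)

# No predict_proba: labels come straight from predict.
y_direct, _ = sketch_predict(posts, LabelModel(), Namespace())
```

Storing the thresholds on the returned namespace is what lets a later call without y_true reuse the values tuned earlier, which matches the description of args in the Returns list.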
predict
predict(texts: list[str], artifacts: dict[str, Any]) -> list[dict]
Load arguments, label binarizer, and the trained model. Then, preprocess given posts and predict their labels using custom_predict. The label binarizer is used to transform the prediction matrix back into readable labels.
Parameters:
- texts (list[str]) – User input list of posts.
- artifacts (dict[str, Any]) – Arguments, label binarizer, and the trained model.
Returns:
- (list[dict]) – Predictions for the given posts, with labels decoded by the label binarizer.
Source code in tagolym/predict.py
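The flow described for predict (unpack the artifacts, preprocess the posts, predict, then decode the prediction matrix into readable labels) might look like the sketch below. Everything specific here is an assumption for illustration: the artifact keys ("args", "binarizer", "model"), the output keys ("text", "labels"), the lowercasing used as a placeholder for preprocessing, and the TinyBinarizer and KeywordModel stand-ins, which only mimic a fitted label binarizer and a trained pipeline.

```python
from argparse import Namespace
from typing import Any

import numpy as np


class TinyBinarizer:
    """Stand-in for a fitted label binarizer: maps indicator rows back to label names."""

    def __init__(self, classes):
        self.classes_ = list(classes)

    def inverse_transform(self, y):
        return [tuple(c for c, flag in zip(self.classes_, row) if flag) for row in y]


class KeywordModel:
    """Toy model that predicts a label when its keyword appears in the post."""

    def __init__(self, keywords):
        self.keywords = keywords

    def predict(self, X):
        return np.array([[int(k in text) for k in self.keywords] for text in X])


def sketch_pipeline(texts: list[str], artifacts: dict[str, Any]) -> list[dict]:
    # Hypothetical artifact keys; the real dictionary layout is not shown on this page.
    args = artifacts["args"]  # would carry preprocessing flags and tuned thresholds
    binarizer = artifacts["binarizer"]
    model = artifacts["model"]

    # Placeholder preprocessing (lowercasing only); the real step is driven by args.
    cleaned = [t.lower() for t in texts]

    # Stands in for the custom_predict call; yields a 0/1 indicator matrix.
    y_pred = model.predict(cleaned)

    # Turn each indicator row back into readable label names.
    labels = binarizer.inverse_transform(y_pred)
    return [{"text": t, "labels": list(l)} for t, l in zip(texts, labels)]


artifacts = {
    "args": Namespace(),
    "binarizer": TinyBinarizer(["algebra", "geometry"]),
    "model": KeywordModel(["equation", "triangle"]),
}
result = sketch_pipeline(["Solve the EQUATION", "Area of a triangle"], artifacts)
```

Passing one artifacts dictionary keeps the arguments, binarizer, and model from the same training run together, so the decoded labels always correspond to the columns the model was trained on.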