evaluate ¤
Given true labels and model predictions, this module computes the model's precision, recall, and F1 score, along with the number of samples. Performance is computed over all samples, per class, and per slice. Eight slices are considered:
- short tokens, i.e. those that have fewer than 5 words,
- six slices in which a post is tagged with a subtopic but not with the broader topic that covers it, and
- tokens that contain no frequent word of more than 3 letters.
short_post ¤
short_post(x: Series) -> bool
Confirm whether a data point has a token with fewer than 5 words.
Parameters:
- x (Series) – Data point containing a token.
Returns:
- bool – Whether the data point has a token with fewer than 5 words.
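A minimal sketch of this check, assuming the token is stored under a field named `token` (the field name is an assumption, since the source is not reproduced here). A plain dict stands in for the Series because both support `x["token"]` lookup.

```python
def short_post_sketch(x) -> bool:
    """Return True if the token has fewer than 5 whitespace-separated words."""
    # "token" is an assumed field name, not confirmed by this page.
    return len(x["token"].split()) < 5


print(short_post_sketch({"token": "prove triangle inequality"}))  # True
print(short_post_sketch({"token": "find all real roots of the polynomial"}))  # False
```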
Source code in tagolym/evaluate.py
inequality_not_algebra ¤
inequality_not_algebra(x: Series) -> bool
Confirm whether a data point has "inequality" but not "algebra" as one of its labels.
Parameters:
- x (Series) – Data point containing a list of labels.
Returns:
- bool – Whether the data point has "inequality" but not "algebra" as one of its labels.
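The check amounts to two membership tests. The sketch below assumes the labels live under a field named `tags` (a hypothetical name) and uses a plain dict in place of the Series.

```python
def inequality_not_algebra_sketch(x) -> bool:
    """Return True if "inequality" is a label and "algebra" is not."""
    labels = x["tags"]  # "tags" is an assumed field name
    return "inequality" in labels and "algebra" not in labels


print(inequality_not_algebra_sketch({"tags": ["inequality", "geometry"]}))  # True
print(inequality_not_algebra_sketch({"tags": ["inequality", "algebra"]}))  # False
```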
function_not_algebra ¤
function_not_algebra(x: Series) -> bool
Confirm whether a data point has "function" but not "algebra" as one of its labels.
Parameters:
- x (Series) – Data point containing a list of labels.
Returns:
- bool – Whether the data point has "function" but not "algebra" as one of its labels.
polynomial_not_algebra ¤
polynomial_not_algebra(x: Series) -> bool
Confirm whether a data point has "polynomial" but not "algebra" as one of its labels.
Parameters:
- x (Series) – Data point containing a list of labels.
Returns:
- bool – Whether the data point has "polynomial" but not "algebra" as one of its labels.
circle_not_geometry ¤
circle_not_geometry(x: Series) -> bool
Confirm whether a data point has "circle" but not "geometry" as one of its labels.
Parameters:
- x (Series) – Data point containing a list of labels.
Returns:
- bool – Whether the data point has "circle" but not "geometry" as one of its labels.
trigonometry_not_geometry ¤
trigonometry_not_geometry(x: Series) -> bool
Confirm whether a data point has "trigonometry" but not "geometry" as one of its labels.
Parameters:
- x (Series) – Data point containing a list of labels.
Returns:
- bool – Whether the data point has "trigonometry" but not "geometry" as one of its labels.
modular_arithmetic_not_number_theory ¤
modular_arithmetic_not_number_theory(x: Series) -> bool
Confirm whether a data point has "modular arithmetic" but not "number theory" as one of its labels.
Parameters:
- x (Series) – Data point containing a list of labels.
Returns:
- bool – Whether the data point has "modular arithmetic" but not "number theory" as one of its labels.
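The six subtopic slices share one pattern, which can be summarized in a table-driven sketch. The pairs come from the function names on this page; the `tags` field name and the helper `subtopic_slices` are hypothetical, and the module itself defines one function per pair rather than a loop.

```python
# (subtopic, broader topic) pairs taken from the six slice functions.
SUBTOPIC_PAIRS = [
    ("inequality", "algebra"),
    ("function", "algebra"),
    ("polynomial", "algebra"),
    ("circle", "geometry"),
    ("trigonometry", "geometry"),
    ("modular arithmetic", "number theory"),
]


def subtopic_slices(x) -> dict:
    """Map each slice name to whether the data point falls in that slice."""
    labels = x["tags"]  # "tags" is an assumed field name
    return {
        f"{sub}_not_{topic}".replace(" ", "_"): sub in labels and topic not in labels
        for sub, topic in SUBTOPIC_PAIRS
    }


point = {"tags": ["circle", "geometry", "inequality"]}
hits = [name for name, hit in subtopic_slices(point).items() if hit]
print(hits)  # ['inequality_not_algebra']
```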
keyword_lookup ¤
keyword_lookup(x: Series, keywords: list) -> bool
Confirm whether the token of a data point contains no frequent word of more than 3 letters.
Parameters:
- x (Series) – Data point containing a token.
- keywords (list) – Frequent words of four or more letters, derived from all tokens.
Returns:
- bool – Whether the token of the data point contains no frequent word of more than 3 letters.
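As a rough illustration, the lookup can be written as a single membership scan. The `token` field name is an assumption, and a plain dict stands in for the Series.

```python
def keyword_lookup_sketch(x, keywords) -> bool:
    """Return True if none of the token's words is a frequent keyword."""
    words = x["token"].split()  # "token" is an assumed field name
    return not any(word in keywords for word in words)


keywords = ["prove", "triangle"]
print(keyword_lookup_sketch({"token": "let x be real"}, keywords))  # True
print(keyword_lookup_sketch({"token": "prove that x"}, keywords))  # False
```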
make_keyword_sf ¤
make_keyword_sf(df: DataFrame) -> SlicingFunction
Create a SlicingFunction object that wraps the keyword_lookup function.
Parameters:
- df (DataFrame) – Preprocessed data containing tokens and their corresponding labels.
Returns:
- SlicingFunction – Slicing function object, i.e. a function that takes a data point as input and returns a boolean stating whether the data point satisfies some predefined condition.
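The sketch below shows one way the frequent keywords could be derived and bound to the lookup. It does not construct a real SlicingFunction: `functools.partial` is used as a stand-in, a list of token strings replaces the DataFrame, and the cut-off `top_n`, the `token` field name, and both helper names are hypothetical.

```python
from collections import Counter
from functools import partial


def keyword_lookup_sketch(x, keywords) -> bool:
    """Return True if none of the token's words is a frequent keyword."""
    return not any(word in keywords for word in x["token"].split())


def make_keyword_check(tokens, top_n=3):
    """Bind the most frequent words of more than 3 letters to the lookup."""
    counts = Counter(
        word for token in tokens for word in token.split() if len(word) > 3
    )
    keywords = [word for word, _ in counts.most_common(top_n)]
    # partial() plays the role of the slicing-function wrapper in this sketch.
    return partial(keyword_lookup_sketch, keywords=keywords)


tokens = [
    "prove triangle inequality",
    "prove that sum is even",
    "find the area of triangle",
]
check = make_keyword_check(tokens, top_n=2)
print(check({"token": "find the area"}))  # True
print(check({"token": "prove it"}))  # False
```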
average_performance ¤
average_performance(y_true: ndarray, y_pred: ndarray, average: Optional[Literal["micro", "macro", "weighted"]] = 'weighted') -> dict[str, Union[float, int]]
Compute precision, recall, F-measure, and number of samples from model predictions and true labels.
Parameters:
- y_true (ndarray) – Ground truth (correct) target values.
- y_pred (ndarray) – Estimated targets as returned by the model.
- average (Optional[Literal], default: 'weighted') – If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:
  - "micro": Calculate metrics globally by counting the total true positives, false negatives, and false positives.
  - "macro": Calculate metrics for each label and find their unweighted mean. This does not take label imbalance into account.
  - "weighted": Calculate metrics for each label and find their average weighted by support (the number of true instances for each label). This alters "macro" to account for label imbalance; it can result in an F-score that is not between precision and recall.
Returns:
- dict[str, Union[float, int]] – Dictionary containing precision, recall, F-measure, and number of samples.
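To make the three averaging modes concrete, here is a plain-Python sketch over 0/1 indicator rows (one column per class). It is an illustration only: the module presumably delegates to a library routine, the dictionary keys are assumed names, undefined ratios are set to 0.0 by choice, and the `average=None` case is not covered.

```python
def average_performance_sketch(y_true, y_pred, average="weighted"):
    """Precision, recall, F1, and sample count for 0/1 indicator rows."""
    n_classes = len(y_true[0])
    tp, fp, fn = [0] * n_classes, [0] * n_classes, [0] * n_classes
    for true_row, pred_row in zip(y_true, y_pred):
        for k, (t, p) in enumerate(zip(true_row, pred_row)):
            if t and p:
                tp[k] += 1
            elif p:
                fp[k] += 1
            elif t:
                fn[k] += 1

    def ratio(num, den):
        return num / den if den else 0.0

    def prf(tp_k, fp_k, fn_k):
        precision = ratio(tp_k, tp_k + fp_k)
        recall = ratio(tp_k, tp_k + fn_k)
        return precision, recall, ratio(2 * precision * recall, precision + recall)

    if average == "micro":
        # Pool the counts over all classes, then compute the scores once.
        scores = prf(sum(tp), sum(fp), sum(fn))
    elif average in ("macro", "weighted"):
        # Score each class, then average (weighted by support if requested).
        per_class = [prf(tp[k], fp[k], fn[k]) for k in range(n_classes)]
        support = [tp[k] + fn[k] for k in range(n_classes)]
        weights = support if average == "weighted" else [1] * n_classes
        scores = tuple(
            ratio(sum(w * m[i] for w, m in zip(weights, per_class)), sum(weights))
            for i in range(3)
        )
    else:
        raise ValueError("this sketch supports only 'micro', 'macro', 'weighted'")
    precision, recall, f1 = scores
    return {"precision": precision, "recall": recall, "f1": f1,
            "num_samples": len(y_true)}


y_true = [[1, 0], [1, 1], [0, 1], [1, 0]]
y_pred = [[1, 0], [1, 0], [0, 1], [0, 1]]
for mode in ("micro", "macro", "weighted"):
    print(mode, average_performance_sketch(y_true, y_pred, mode))
```

In this toy data the first class has support 3 and the second has support 2, so "weighted" pulls the scores toward the first class while "macro" treats both equally.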
get_slice_metrics ¤
get_slice_metrics(y_true: ndarray, y_pred: ndarray, slices: ndarray) -> dict[str, dict]
Apply average_performance with "micro" average to different slices of data.
Parameters:
- y_true (ndarray) – Ground truth (correct) target values.
- y_pred (ndarray) – Estimated targets as returned by the model.
- slices (ndarray) – Slices of data defined by slicing functions.
Returns:
- dict[str, dict] – Dictionary containing the micro-averaged performance of each slice.
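A minimal sketch of per-slice evaluation, under several assumptions: a dict of 0/1 masks stands in for the slices array, empty slices are skipped, and the result keys are illustrative names. The helper `micro_performance` is a hypothetical plain-Python substitute for calling average_performance with "micro".

```python
def micro_performance(y_true, y_pred):
    """Micro-averaged precision, recall, and F1 over 0/1 indicator rows."""
    pairs = [(t, p) for tr, pr in zip(y_true, y_pred) for t, p in zip(tr, pr)]
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if p and not t)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "num_samples": len(y_true)}


def get_slice_metrics_sketch(y_true, y_pred, slices):
    """Evaluate each non-empty slice on the rows its mask selects."""
    metrics = {}
    for name, mask in slices.items():
        rows = [i for i, keep in enumerate(mask) if keep]
        if rows:  # skipping empty slices is an assumption of this sketch
            metrics[name] = micro_performance(
                [y_true[i] for i in rows], [y_pred[i] for i in rows]
            )
    return metrics


y_true = [[1, 0], [1, 1], [0, 1], [1, 0]]
y_pred = [[1, 0], [1, 0], [0, 1], [0, 1]]
slices = {"short_post": [1, 1, 0, 0], "circle_not_geometry": [0, 0, 0, 0]}
print(get_slice_metrics_sketch(y_true, y_pred, slices))
```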
get_metrics ¤
get_metrics(y_true: ndarray, y_pred: ndarray, classes: ndarray, df: Optional[DataFrame] = None) -> dict[str, dict]
Compute model performance for the overall data (using "weighted" average), across classes, and across slices (using "micro" average).
Parameters:
- y_true (ndarray) – Ground truth (correct) target values.
- y_pred (ndarray) – Estimated targets as returned by the model.
- classes (ndarray) – Complete list of class labels.
- df (Optional[DataFrame], default: None) – Preprocessed data containing tokens and their corresponding labels.
Returns:
- dict[str, dict] – Dictionary containing dictionaries of average performances for the overall data, across classes, and across slices.
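The shape of the result can be sketched as a nested dictionary. The example below covers only the overall (support-weighted) and per-class parts and leaves out the slice part; the keys `"overall"` and `"class"`, the per-metric key names, and the use of class support as the per-class sample count are all assumptions.

```python
NAMES = ("precision", "recall", "f1")


def class_counts(y_true, y_pred, k):
    """True positives, false positives, and false negatives for column k."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t[k] and p[k])
    fp = sum(1 for t, p in zip(y_true, y_pred) if p[k] and not t[k])
    fn = sum(1 for t, p in zip(y_true, y_pred) if t[k] and not p[k])
    return tp, fp, fn


def prf(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def get_metrics_sketch(y_true, y_pred, classes):
    """Overall (support-weighted) and per-class scores as nested dicts."""
    per_class = [prf(*class_counts(y_true, y_pred, k)) for k in range(len(classes))]
    support = [sum(row[k] for row in y_true) for k in range(len(classes))]
    total = sum(support)
    overall = [
        sum(s * scores[i] for s, scores in zip(support, per_class)) / total
        if total else 0.0
        for i in range(3)
    ]
    return {
        "overall": {**dict(zip(NAMES, overall)), "num_samples": len(y_true)},
        "class": {
            name: {**dict(zip(NAMES, per_class[k])), "num_samples": support[k]}
            for k, name in enumerate(classes)
        },
    }


y_true = [[1, 0], [1, 1], [0, 1], [1, 0]]
y_pred = [[1, 0], [1, 0], [0, 1], [0, 1]]
print(get_metrics_sketch(y_true, y_pred, ["algebra", "geometry"]))
```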