I had a few bugs (using the wrong variable name) and realized I never got yelled at for providing bad feature names.
A few observations:
- The number of feature names doesn't have to match the number of features.
There can be too many names (x, y, z and an additional "DNE" name):
ds = RegrDataset()
ds.description = "extra feature names"
ds.add_samplet('id1', target=100, features=[1,2,3], feature_names=['x','y','z'])
ds.add_samplet('id2', target=200, features=[4,5,6], feature_names=['x','y','z','DNE'])
(x, _, _) = ds.data_and_targets()
print(ds.feature_names)
print(x)
['x' 'y' 'z' 'DNE']
[[1. 2. 3.]
[4. 5. 6.]]
or too few (only x is named, but each samplet has three features: x, y, and z):
ds = RegrDataset()
ds.description = "extra feature names"
ds.add_samplet('id1', target=100, features=[1,2,3], feature_names=['x'])
ds.add_samplet('id2', target=200, features=[6,5,4], feature_names=['x'])
[x, _, _] = ds.data_and_targets()
print(ds.feature_names)
print(x)
['x']
[[1. 2. 3.]
[6. 5. 4.]]
- Specifying feature names for one samplet changes the names for the whole dataset?
ds = RegrDataset()
ds.description = "extra feature names"
ds.add_samplet('id1', target=100, features=[1,2,3], feature_names=['x','y','z'])
ds.add_samplet('id2', target=200, features=[4,5,6], feature_names=['y','y','z'])
[x, _, _] = ds.data_and_targets()
print(ds.feature_names)
print(x)
['y' 'y' 'z']
[[1. 2. 3.]
[4. 5. 6.]]
This is potentially surprising when features are given to add_samplet in a different order, even if features and feature_names are paired correctly (@raamana: a thing you warned me to check. Good eye!). The values are not reordered to match, so id1's x value (1) ends up in the column now labeled z:
ds = RegrDataset()
ds.description = "extra feature names"
ds.add_samplet('id1', target=100, features=[1,2,3], feature_names=['x','y','z'])
ds.add_samplet('id2', target=200, features=[6,5,4], feature_names=['z','y','x'])
[x, _, _] = ds.data_and_targets()
print(ds.feature_names)
print(x)
['z' 'y' 'x']
[[1. 2. 3.]
[6. 5. 4.]]
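For illustration, here is a minimal sketch of the kind of check that would catch the cases above. It is plain Python and does not touch the dataset class; `align_features` and its `expected_names` parameter are made-up names for this example, not an existing library function, and whether mismatched names should raise or be reordered is an assumption on my part.

```python
def align_features(features, feature_names, expected_names=None):
    """Validate one samplet's names and return its values in the expected order.

    Hypothetical helper for illustration only; not an existing library function.
    """
    features = list(features)
    feature_names = list(feature_names)

    # Too many or too few names for the values supplied.
    if len(features) != len(feature_names):
        raise ValueError(
            f"got {len(features)} features but {len(feature_names)} feature names"
        )

    # Repeated names (e.g. ['y', 'y', 'z']) make the columns ambiguous.
    if len(set(feature_names)) != len(feature_names):
        raise ValueError(f"duplicate feature names: {feature_names}")

    # First samplet: there is nothing to compare against yet.
    if expected_names is None:
        return features

    expected_names = list(expected_names)
    if set(feature_names) != set(expected_names):
        raise ValueError(
            f"feature names {feature_names} do not match {expected_names}"
        )

    # Reorder the values so each one lands under its own name.
    by_name = dict(zip(feature_names, features))
    return [by_name[name] for name in expected_names]


names = ['x', 'y', 'z']
print(align_features([1, 2, 3], names))                   # [1, 2, 3]
print(align_features([6, 5, 4], ['z', 'y', 'x'], names))  # [4, 5, 6]
```

With something like this, the "DNE", single-name, and ['y', 'y', 'z'] examples would raise a ValueError, and the reversed-order samplet would be stored as [4, 5, 6] under x, y, z.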