Description
This is a bit similar to issue #157, where a column ends up being a Maybe or not depending on luck.
Here the problem is between Double and Int. Yesterday I tried to rerun something I did last week, fed the system a new file, and suddenly a column which was a Double is now an Int. In fairness, the column has always contained natural numbers, but to be mixed with other columns it is easier to treat it as a Double.
Changing 0 to 0.0 in the csv file fixed the problem, but it wasn't easy to find initially. The error was a runtime error, and I don't remember whether the column name was actually spelled out in the error.
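To illustrate why a single 0 versus 0.0 can flip the column type, here is a minimal sketch of sample-based inference. It assumes the reader tries Int first and falls back to Double; `ColType` and `inferColumn` are hypothetical names for illustration, not the library's actual implementation.

```haskell
import Data.Maybe (isJust)
import Text.Read (readMaybe)

-- Hypothetical column type tag, for illustration only.
data ColType = IntType | DoubleType | TextType
  deriving (Eq, Show)

-- Pick the narrowest type that every sampled cell parses as.
inferColumn :: [String] -> ColType
inferColumn cells
  | allParse (readMaybe :: String -> Maybe Int) = IntType
  | allParse (readMaybe :: String -> Maybe Double) = DoubleType
  | otherwise = TextType
  where
    -- True when the given parser succeeds on every cell.
    allParse parse = all (isJust . parse) cells
```

Under this assumption, a file whose sampled cells all look like whole numbers is inferred as Int, while a single "0.0" is enough to make the whole column a Double.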
I can see two ways of solving the problem:
1 - specifying Double when reading the csv
2 - somehow casting the column to a Double.
For 1), I understand that there is a readCsvWithOptions which allows passing a schema. However, I just want to specify one column, not all of them: AFAIU, readCsv can either infer from a sample or use a schema, but not both (i.e. use the schema for some columns and infer the rest).
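The "schema for some columns, inference for the rest" behaviour could be sketched as a left-biased merge of two maps. This is only an illustration of the idea: `ColType`, `Schema` and `mergeSchema` are hypothetical names, and nothing here is part of readCsv or readCsvWithOptions.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical column type tag, for illustration only.
data ColType = IntType | DoubleType | TextType
  deriving (Eq, Show)

-- Column name to column type.
type Schema = Map.Map String ColType

-- Map.union is left-biased: user overrides win on conflicts,
-- and inferred types fill in every column the user did not mention.
mergeSchema :: Schema -> Schema -> Schema
mergeSchema overrides inferred = Map.union overrides inferred
```

With this, overriding a single column to DoubleType leaves every other inferred column as it was.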
For 2), I could use F.lift round or apply round, but I think both will crash if the column is already a Double (this is similar to the impute problem in #157).
Maybe we need a way to apply or lift a function only if the column matches the given type, and do nothing otherwise instead of crashing. This could be done by either
- introducing new functions,
- new functions returning Either, as in applyEither :: Expr a -> (a -> b) -> Either b b,
- maybe a new wrapper, as in apply (F.col @(DontCrashIfIAMNoAnA Int)) round,
- or a new type of column, as in apply (F.colWhichIsPossiblyAn Int) round.
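A minimal sketch of the "apply only if the type matches, otherwise do nothing" idea, using a toy column type. `Column`, `toDoubleColumn` and `applyIfInt` are hypothetical names for illustration, not existing functions of the library, and the real column representation is assumed to be different.

```haskell
-- Hypothetical stand-in for a dynamically typed column.
data Column
  = IntColumn [Int]
  | DoubleColumn [Double]
  | TextColumn [String]
  deriving (Eq, Show)

-- Widen an Int column to Double; any other column is returned unchanged
-- instead of raising a runtime error.
toDoubleColumn :: Column -> Column
toDoubleColumn (IntColumn xs) = DoubleColumn (map fromIntegral xs)
toDoubleColumn other = other

-- Either-returning variant: Right when the function was applied,
-- Left with the untouched column when the type did not match.
applyIfInt :: (Int -> Double) -> Column -> Either Column Column
applyIfInt f (IntColumn xs) = Right (DoubleColumn (map f xs))
applyIfInt _ other = Left other
```

The first function silently does nothing on a mismatch, while the Either variant lets the caller find out whether the conversion actually happened.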