Double becoming Int when reading Csv #177

@maxigit

Description

This is somewhat similar to issue #157, where a column ends up being Maybe or not depending on luck.

Here the problem is between Double and Int. Yesterday I tried to rerun something I did last week, fed the system a new file, and suddenly a column which used to be a Double is now an Int. In fairness, the column has always contained natural numbers, but because it gets combined with other columns it is easier to treat it as a Double.

Changing 0 to 0.0 in the CSV file fixed the problem, but the cause wasn't easy to find at first. The error was a runtime error, and I don't remember whether the column name was actually spelled out in the error message.
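To illustrate why a single 0 versus 0.0 can flip the column type, here is a minimal sketch of sample-based inference. It is not the library's actual inference code: `ColType` and `inferType` are hypothetical names made up for this example, and the only assumption is that inference picks the narrowest type that every sampled cell parses as.

```haskell
import Data.Maybe (isJust)
import Text.Read (readMaybe)

-- Hypothetical column types for this sketch; not the library's own types.
data ColType = IntCol | DoubleCol | TextCol
  deriving (Eq, Show)

-- Pick the narrowest type that every sampled cell parses as.
inferType :: [String] -> ColType
inferType cells
  | all parsesAsInt cells    = IntCol
  | all parsesAsDouble cells = DoubleCol
  | otherwise                = TextCol
  where
    parsesAsInt s    = isJust (readMaybe s :: Maybe Int)
    parsesAsDouble s = isJust (readMaybe s :: Maybe Double)

main :: IO ()
main = do
  print (inferType ["0", "12", "7"])    -- prints IntCol
  print (inferType ["0.0", "12", "7"])  -- prints DoubleCol
```

Under that assumption, a file whose sample happens to contain only whole numbers is read as Int, and one decimal point anywhere in the sample is enough to make it Double.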

I can see two ways of solving the problem:
1 - specifying Double when reading the CSV
2 - somehow casting the column to a Double.

For 1) I understand that there is a readCsvWithOptions which allows passing a schema. However, I just want to specify one column, not all of them: AFAIU, readCsv can either infer from a sample or use a schema, but not both (i.e. use the schema for some columns and infer the rest).
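The "schema for some columns, inference for the rest" behaviour could look roughly like the sketch below. This is only an illustration of the idea, not the library's API: `ColType`, `inferType`, and `resolveSchema` are hypothetical names, and columns are modelled as plain lists of raw cell strings.

```haskell
import Data.Maybe (fromMaybe, isJust)
import Text.Read (readMaybe)

-- Hypothetical column types for this sketch; not the library's own types.
data ColType = IntCol | DoubleCol | TextCol
  deriving (Eq, Show)

-- Toy inference: the narrowest type that every sampled cell parses as.
inferType :: [String] -> ColType
inferType cells
  | all (\s -> isJust (readMaybe s :: Maybe Int)) cells    = IntCol
  | all (\s -> isJust (readMaybe s :: Maybe Double)) cells = DoubleCol
  | otherwise                                              = TextCol

-- Use the user-supplied type when a column is listed in the partial schema,
-- and fall back to inference for every other column.
resolveSchema :: [(String, ColType)] -> [(String, [String])] -> [(String, ColType)]
resolveSchema overrides columns =
  [ (name, fromMaybe (inferType cells) (lookup name overrides))
  | (name, cells) <- columns
  ]

main :: IO ()
main = do
  let columns = [("price", ["0", "12"]), ("qty", ["3", "4"])]
  -- Only "price" is pinned to Double; "qty" is still inferred.
  print (resolveSchema [("price", DoubleCol)] columns)
  -- prints [("price",DoubleCol),("qty",IntCol)]
```

With something like this, the type of the pinned column no longer depends on what the sample happens to contain.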

For 2) I could use F.lift round or apply round, but I think both will crash if the column is already a Double (this is a similar problem to the impute issue in #157).

Maybe we need a way to apply or lift a function only if the column matches the given type, and to do nothing otherwise instead of crashing. This could be done by either

  • introducing new functions,
  • new functions returning Either, as in applyEither :: Expr a -> (a -> b) -> Either b b
  • maybe a new wrapper, as in apply (F.col @(DontCrashIfIAMNoAnA Int)) round
  • or a new type of column, as in apply (F.colWhichIsPossiblyAn Int) round.
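As a rough sketch of the "apply only if the type matches, otherwise do nothing" idea, assuming a column can be modelled as a simple sum type: `Column` and `applyIfInt` below are hypothetical and exist only for this example; they are not part of the library.

```haskell
-- Hypothetical untyped column for this sketch; not the library's representation.
data Column
  = IntColumn [Int]
  | DoubleColumn [Double]
  | TextColumn [String]
  deriving (Eq, Show)

-- Apply the function only when the column actually holds Ints;
-- any other column is returned unchanged instead of raising an error.
applyIfInt :: (Int -> Double) -> Column -> Column
applyIfInt f (IntColumn xs) = DoubleColumn (map f xs)
applyIfInt _ other          = other

main :: IO ()
main = do
  print (applyIfInt fromIntegral (IntColumn [0, 1, 2]))
  -- prints DoubleColumn [0.0,1.0,2.0]
  print (applyIfInt fromIntegral (DoubleColumn [0.5, 1.5]))
  -- prints DoubleColumn [0.5,1.5]
```

The same call is then safe whether the reader inferred Int or Double, which is the property the options above are after.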
