Banyc/dfsql

dfsql

  • Revision: the standalone count command has been replaced with len. Replace (count) and col "count" with len and col "len" respectively.
    • The unary count <col> command is unaffected.

Install

cargo install dfsql

How to run

dfsql --input your.csv --output a-new.csv
# ...or
dfsql -i your.csv -o a-new.csv

REPL

  • exit/quit: exit the REPL loop.
    exit
  • undo: undo the previous successful operation.
    undo
  • reset: reset all the changes and go back to the original data frame.
    reset
  • schema: show column names and types of the data frame.
    schema
  • save: save the current data frame to a file.
    save a-new.csv
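
One way to picture undo and reset is as a stack of data frame states. The sketch below is a toy model in Python, not dfsql's actual implementation; the History class and its methods are hypothetical names chosen for illustration, and plain lists stand in for data frames.

```python
class History:
    """Toy model of the REPL's undo/reset behavior (an assumption, not dfsql code)."""

    def __init__(self, original):
        # states[0] is always the original data frame.
        self.states = [original]

    @property
    def current(self):
        return self.states[-1]

    def apply(self, operation):
        # Only a successful operation adds a state; if it raises,
        # the history is left unchanged.
        new_state = operation(self.current)
        self.states.append(new_state)

    def undo(self):
        # Drop the most recent state, but never the original.
        if len(self.states) > 1:
            self.states.pop()

    def reset(self):
        # Discard every change and go back to the original.
        del self.states[1:]


history = History([3, 1, 2])
history.apply(sorted)                 # stands in for a statement such as sort
history.apply(lambda rows: rows[:2])  # stands in for limit 2
after_limit = history.current
history.undo()
after_undo = history.current
history.reset()
after_reset = history.current
```

In this model, undo steps back exactly one successful statement, while reset returns to the frame that was loaded from the input file.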

Statements

  • select
    select <expr>*
    select last_name first_name
    • Select columns "last_name" and "first_name" and collect them into a data frame.
  • group
    group (<col> | <var>)* agg <expr>*
    group first_name agg (len)
    • Group the data frame by column "first_name" and then aggregate each group with the number of its members.
  • filter
    filter <expr>
    filter first_name = "John"
  • limit
    limit <int>
    limit 5
  • reverse
    reverse
  • sort
    sort ((asc | desc | ()) <col>)*
    sort icpsr_id
  • use
    use <var>
    use other
    • Switch to the data frame called other.
  • join
    (left | right | inner | full) join <var> on <col> <col>?
    left join other on id ID
    • Left join the data frame called other, matching this data frame's column id against its column ID.
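
To make the group and join statements more concrete, here is a small Python sketch of what they compute, using lists of dicts in place of data frames. It is only an illustration under stated assumptions: group_len and left_join are hypothetical helpers, the sample rows are made up, and details such as row order, null representation, and whether the right-hand key column is kept may differ in dfsql.

```python
from collections import Counter

# Made-up sample data standing in for two data frames.
people = [
    {"id": 1, "first_name": "John"},
    {"id": 2, "first_name": "Jane"},
    {"id": 3, "first_name": "John"},
]
other = [
    {"ID": 1, "city": "Oslo"},
    {"ID": 3, "city": "Lima"},
]


def group_len(rows, key):
    """Rough equivalent of: group <key> agg (len)."""
    counts = Counter(row[key] for row in rows)
    return [{key: value, "len": n} for value, n in counts.items()]


def left_join(left, right, left_key, right_key):
    """Rough equivalent of: left join other on <left_key> <right_key>."""
    right_columns = [c for c in right[0] if c != right_key]
    joined = []
    for row in left:
        matches = [r for r in right if r[right_key] == row[left_key]]
        if not matches:
            # Every left row is kept; unmatched right columns become None.
            matches = [dict.fromkeys(right_columns)]
        for match in matches:
            joined.append({**row, **{c: match[c] for c in right_columns}})
    return joined


grouped = group_len(people, "first_name")
joined = left_join(people, other, "id", "ID")
```

Here grouped holds one row per distinct first name with its member count, and joined keeps all three people, with city left empty for the row that has no match in other.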

Expressions

  • col: reference to a column.
    col : (<str> | <var>) -> <expr>
    select col first_name
  • exclude: remove columns from the data frame.
    exclude : <expr>* -> <expr>
    select exclude last_name first_name
  • literal: literal values like 42, "John", 1.0, and null.
  • binary operations
    select a * b
    • Calculate the product of columns "a" and "b" and collect the result.
  • unary operations
    select -a
    select sum a
    • Sum all values in column "a" and collect the scalar result.
  • alias: assign a name to a column.
    alias : (<col> | <var>) <expr> -> <expr>
    select alias product a * b
    • Assign the name "product" to the product and collect the new column.
  • conditional
    <conditional> : if <expr> then <expr> (if <expr> then <expr>)* otherwise <expr> -> <expr>
    select if class = 0 then "A" if class = 1 then "B" otherwise null
  • cast: cast a column to type str, int, or float.
    cast : <type> <expr> -> <expr>
    select cast str id
    • Cast the column "id" to type str and collect the result.
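
The conditional, alias, and cast expressions can be read as per-row computations. The Python sketch below models them on made-up rows. It assumes that branches are tested in order and the first match wins, which the grammar suggests but does not state; the conditional helper and the sample data are hypothetical.

```python
def conditional(branches, fallback):
    """Build a per-row function: the first branch whose condition holds wins."""
    def evaluate(row):
        for condition, value in branches:
            if condition(row):
                return value
        # No branch matched, so use the final fallback value.
        return fallback
    return evaluate


# Made-up rows standing in for a data frame.
rows = [
    {"id": 10, "class": 0, "a": 2, "b": 3},
    {"id": 11, "class": 1, "a": 4, "b": 5},
    {"id": 12, "class": 7, "a": 6, "b": 7},
]

# Models: if class = 0 then "A" if class = 1 then "B", with null as the fallback.
grade = conditional(
    [
        (lambda r: r["class"] == 0, "A"),
        (lambda r: r["class"] == 1, "B"),
    ],
    None,
)
grades = [grade(r) for r in rows]

# Models: select alias product a * b
products = [{"product": r["a"] * r["b"]} for r in rows]

# Models: select cast str id
ids_as_str = [str(r["id"]) for r in rows]
```

The third row matches neither condition, so it receives the fallback value, which is null in the dfsql example and None in this sketch.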

About

SQL REPL/lib for Data Frames