I am using Miller to work with .csv files, but there is one issue I can't figure out how to get around. I have two .csv files.

product-feeder.csv

name-manufacturer,id-product,name-product,id-manufacturer,id-supplier,image-posted
MANUFACTURER ONE,0,PRODUCT A,0,0,0
MANUFACTURER TWO,0,PRODUCT B,0,0,0
MANUFACTURER ONE,0,UNMATCHED PRODUCT ONE,0,0,0
UNKOWN MANUFACTURER,0,UNMATCHED PRODUCT TWO,0,0,0

and manufacturer-feeder.csv

id-manufacturer,name-manufacturer,link-rewrite,image-posted
114,MANUFACTURER ONE,manu-one,1
116,MANUFACTURER TWO,manu-two,1

I just want to replace the values in the ${id-manufacturer} column of product-feeder.csv with the ${id-manufacturer} from manufacturer-feeder.csv, matching rows on ${name-manufacturer}. Rows with no matching manufacturer should keep their current value. The result should be:

name-manufacturer,id-product,name-product,id-manufacturer,id-supplier,image-posted
MANUFACTURER ONE,0,PRODUCT A,114,0,0
MANUFACTURER TWO,0,PRODUCT B,116,0,0
MANUFACTURER ONE,0,UNMATCHED PRODUCT ONE,114,0,0
UNKOWN MANUFACTURER,0,UNMATCHED PRODUCT TWO,0,0,0

I think in the spreadsheet world this function would be called VLOOKUP. Do you know how to tackle this with Miller?

David Maze
asked Dec 20, 2025 at 14:51

1 Answer

One solution is to use Miller's join verb.

Running

mlr --csv join --ul \
 -j name-manufacturer \
 --rp "r_" \
 -f product-feeder.csv \
then unsparsify \
then put '${id-manufacturer}=${r_id-manufacturer}' \
then cut -x -r -f "^r_.+" \
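manufacturer-feeder.csv

gives the following result (shown here as a table; with --csv the same records are printed as CSV):
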
+---------------------+------------+-----------------------+-----------------+-------------+--------------+
| name-manufacturer | id-product | name-product | id-manufacturer | id-supplier | image-posted |
+---------------------+------------+-----------------------+-----------------+-------------+--------------+
| MANUFACTURER ONE | 0 | PRODUCT A | 114 | 0 | 0 |
| MANUFACTURER ONE | 0 | UNMATCHED PRODUCT ONE | 114 | 0 | 0 |
| MANUFACTURER TWO | 0 | PRODUCT B | 116 | 0 | 0 |
| UNKOWN MANUFACTURER | 0 | UNMATCHED PRODUCT TWO | | 0 | 0 |
+---------------------+------------+-----------------------+-----------------+-------------+--------------+

Explanation:

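  • An unpaired-left join (--ul) is performed on name-manufacturer. In
    Miller the left file is the one given with -f, so all records from
    product-feeder.csv are kept, including those with no match.
  • Non-join fields coming from manufacturer-feeder.csv (the right file)
    are prefixed with r_ (--rp) to avoid collisions.
  • unsparsify fills in the missing r_ fields of unmatched records with
    empty values, so that every record has the same fields.
  • The value of r_id-manufacturer is copied into id-manufacturer. For
    unmatched records that value is empty, which is why the last row
    shows an empty id-manufacturer rather than 0.
  • All temporary r_ fields are removed at the end.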
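
To check the expected result independently of Miller, the same VLOOKUP-style lookup can be sketched in plain Python with only the standard library. This is a minimal illustration, not part of the Miller solution: the function names fill_manufacturer_ids and to_csv are hypothetical, and the data is copied inline from the question. Unlike the join above, this sketch keeps the input row order and leaves id-manufacturer unchanged (0 in the sample) when no manufacturer matches.

```python
import csv
import io

# Sample data copied from the question.
PRODUCTS = """\
name-manufacturer,id-product,name-product,id-manufacturer,id-supplier,image-posted
MANUFACTURER ONE,0,PRODUCT A,0,0,0
MANUFACTURER TWO,0,PRODUCT B,0,0,0
MANUFACTURER ONE,0,UNMATCHED PRODUCT ONE,0,0,0
UNKOWN MANUFACTURER,0,UNMATCHED PRODUCT TWO,0,0,0
"""

MANUFACTURERS = """\
id-manufacturer,name-manufacturer,link-rewrite,image-posted
114,MANUFACTURER ONE,manu-one,1
116,MANUFACTURER TWO,manu-two,1
"""


def fill_manufacturer_ids(products_text, manufacturers_text):
    """Return product rows with id-manufacturer looked up by name-manufacturer."""
    # Lookup table: manufacturer name -> manufacturer id.
    ids_by_name = {
        row["name-manufacturer"]: row["id-manufacturer"]
        for row in csv.DictReader(io.StringIO(manufacturers_text))
    }
    rows = list(csv.DictReader(io.StringIO(products_text)))
    for row in rows:
        # Unmatched names keep their existing value (0 in the sample data).
        row["id-manufacturer"] = ids_by_name.get(
            row["name-manufacturer"], row["id-manufacturer"]
        )
    return rows


def to_csv(rows):
    """Serialize a non-empty list of row dicts back to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    print(to_csv(fill_manufacturer_ids(PRODUCTS, MANUFACTURERS)), end="")
```

For real files, the two strings would be replaced by the contents of product-feeder.csv and manufacturer-feeder.csv.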
answered Dec 20, 2025 at 15:37