73 questions
0 votes, 1 answer, 49 views
Extract the "column name" of a data frame scraped by tabulapdf in R
I am using tabulizer/tabulapdf to scrape a table from a pdf. My scripts worked a few months ago but now I'm getting a data frame that I'm unfamiliar with - and it's throwing errors. The issue seems ...
0 votes, 1 answer, 130 views
Trying to systematically extract information (text + table) from pdf in R
For a project, I need to extract information from a PDF file that is not available anywhere else. Since this involves thousands of data points, I do not want to type them in manually, as that is error-prone. ...
0 votes, 1 answer, 309 views
R Extract Table Function from Tabulizer to Data Frame
I'm trying to extract tables from PDFs using the Tabulizer library. I extracted the 1st page with no issue and then converted it to a data frame. After that, I was just cutting the edges of all data ...
0 votes, 0 answers, 195 views
Issues installing tabulizer in R 4.4 on macOS 13
I'm having many issues installing tabulizer in R 4.4 on my Mac OS13. I've reproduced the error message below. I've tried every other suggestion on stack and nothing seems to do the trick. I've ...
0 votes, 1 answer, 52 views
Replace current variable names and move them into rows
After extracting tables from a PDF using tabulizer, my table looks like:
A  King    Blue
D  Queen   Red
T  Prince  Black
I want to move the variable names down as observations and replace them with a vector of ...
0 votes, 0 answers, 83 views
Error with PDF scraping using Tabulizer library
I'm trying to extract tables from several pdf files and used the Tabulizer library. However, as I use the extract_tables function, I keep getting this error:
Error in .jcall("RJavaTools", &...
1 vote, 1 answer, 1k views
Scraping two-column PDF
I am trying to scrape the text of hundreds of PDFs for a project.
The PDFs have title pages, headers, footers, and two columns. I tried the packages pdftools and tabulizer. However, both have their ...
2 votes, 2 answers, 556 views
RStudio fatal error when loading tabulizer
I recently updated R to version 4.2.0 on my Windows 10 PC. When I try to load the package tabulizer, RStudio crashes and the bomb icon with the corresponding "R encountered a fatal error" ...
1 vote, 1 answer, 3k views
How can I install the package 'tabulizer'?
I need to work with the "tabulizer" library in R but when installing the package it shows me the following message: "Installing package into 'C:/Users/Usuario/Documents/R/win-library/4....
0 votes, 1 answer, 409 views
Import all tables from PDF or html to R
I am trying to import tables from a website to R. The data is shown in the html as well as a downloadable PDF.
I have tried using the tabulizer package on the PDF, specifically the expand_tables() and ...
5 votes, 0 answers, 970 views
Trying to resolve Java issue when running Tabulizer in R
I am trying to extract tables from pdfs in R using tabulizer, and keep getting this error when I use extract_tables.
Error in .jcall("RJavaTools", "Ljava/lang/Object;", "...
0 votes, 0 answers, 381 views
extract_tables function status was 'SSL connect error' error
I posted a similar question on GitHub. However, as I did not receive a reply, I wanted to post it here in case someone can help me with this issue. Thank you in advance for your help.
During the ...
4 votes, 5 answers, 14k views
Having Issues installing tabulizer package in R
I had a script working with tabulizer, but had to clean my hard drive and reinstall R, and now I can't seem to even download and access the tabulizer library. I am now using R version 4.1.2 64-bit, and ...
1 vote, 0 answers, 295 views
Error with extract_tables function when running it in a Jupyter notebook but not in the console
library(tabulizer)
f <- system.file("examples", "data.pdf", package = "tabulizer")
f1 <- extract_tables(f, output = "data.frame")
f1[[1]]
Running the ...
1 vote, 0 answers, 606 views
Extract table from PDF in R
I am new to R and I want to extract data from a PDF.
For context, I followed a tutorial to set up rJava and then tried to run the code:
pacman::p_load(
rJava,
tabulizer,
tidyverse)
Df <-...