Capture patterns


Hello,
I am attempting to capture a series of numbers (all dollar amounts). I have
experimented with multiple patterns to no avail. I created this small
program to demonstrate my problem.
I am reading PDF reports and pulling a description and dollar amounts from
each line. My pattern also catches periods (used to indicate abbreviations)
in the description. I am unable to figure out how to build a pattern that
captures only the dollar amounts.
Here's the layout:
Description 0.00 3,587.46 (125,000.00)
The description may contain parentheses or periods. The dollar figures are
in US standard accounting format, where numbers enclosed in parentheses are
negative values. The program below contains sample data that shows the
problem.
--
function extract(string1)
  --
  -- culls the string at the first digit in the report
  --
  local s, e = string.find(string1, "[%(]*%d+")
  if s == nil then return nil end
  local item = string.sub(string1, 1, s - 1)
  return item
end
--
--
--
function trim(s)
  return (string.gsub(s, "^%s*(.-)%s*$", "%1"))
end
--
-- 
-- 10/29/07 handle negatives in pattern. DOS uses "-" at end of number
-- 
 
 local record = nil
 local i = 0
 local nums = {}
 local line_data = {}
--
--/////////////////////////////////////////////////////////
--
 local pattern = "[%-%(%$]*[%d%,]*[%d%.%d%d]+[%)%-]*"
-- 
--/////////////////////////////////////////////////////////
-- 
 
 print("****************************Debug13*********************************")
 print("debug13- Version 1.0 1/07/08")
 print("debug13- Build CSV data from PDF listing of spreadsheet.", "\n\n")
 line_data[1] = "This is good data 4.00 3.99 (1,768.50)"
 line_data[2] = "Data- In a line 5.00 (100,000.00) 957,123.45"
 line_data[3] = "Repairs- () (Wages) 123,456.99 28,123.45 650.00"
 line_data[4] = "Repairs- Ex. Wages 50,120.00 500.00 1,000.00"
 line_data[5] = "A bunch of negatives (123,456.89) (123,456.90) (123,456.91)"
 
for i = 1, #line_data do

  for num in string.gmatch(line_data[i], pattern) do
    nums[#nums + 1] = num
  end -- for num

  print(#nums, " <=== Number of captured numbers")

  if #nums == 3 then
    descr = trim(extract(line_data[i]))
    print(descr, nums[1], nums[2], nums[3])
  end -- if

  if #nums == 4 then
    descr = trim(extract(line_data[i]))
    print(descr, nums[1], nums[2], nums[3], nums[4])
  end -- if

  nums = {}

  print("----------------Loop Separator-------------")

end -- for i
 
 print("debug13- End of execution")
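
A note on why the period in "Ex." is captured, with one possible fix. In the
pattern above, [%d%.%d%d]+ is a single character set, so it matches any run
of digits or periods, and a lone "." satisfies it. The sketch below instead
requires at least one digit, then a literal period, then exactly two digits.
It assumes every amount has two decimal places, as in the sample data. The
names amount_pattern, amounts and to_number are hypothetical and are not
part of the program above.

```lua
-- Hypothetical replacement pattern: optional leading "-", "(" or "$",
-- one digit, any further digits or commas, a literal ".", two digits,
-- then an optional trailing ")" or "-".
-- Assumption: every amount is written with exactly two decimal places.
local amount_pattern = "[%-%(%$]*%d[%d,]*%.%d%d[%)%-]*"

-- Collect every amount found in one report line.
function amounts(line)
  local found = {}
  for num in string.gmatch(line, amount_pattern) do
    found[#found + 1] = num
  end
  return found
end

-- Convert a captured amount to a Lua number. A "(" or "-" anywhere in
-- the capture is treated as a negative sign (accounting parentheses or
-- the trailing DOS-style "-").
function to_number(text)
  local negative = string.find(text, "[%(%-]") ~= nil
  local digits = string.gsub(text, "[^%d%.]", "")
  local value = tonumber(digits)
  if value == nil then return nil end
  if negative then value = -value end
  return value
end

local found = amounts("Repairs- Ex. Wages 50,120.00 500.00 1,000.00")
print(#found, found[1], found[2], found[3])
-- prints: 3, 50,120.00, 500.00, 1,000.00 (tab separated)
```

With this pattern the period in "Ex." and the empty "()" are skipped,
because neither contains a digit followed by ".dd". Whole-dollar figures
with no decimal part would not match, so the pattern would need loosening
if the reports contain them.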