After trying a couple of parsers, such as Log Parser 2.2, I ended up writing a small utility that is supposed to parse the following information into an object:
(from DirectX Caps Viewer, truncated)
DirectX Graphics Adapters
AMD Radeon HD 7800 Series
Driver aticfx32.dll
Description AMD Radeon HD 7800 Series
DriverVersion 524305656590
VendorId 4,098
DeviceId 26,649
SubSysId -501,147,829
Revision 0
DeviceIdentifier {D7B71EE2-2B59-11CF-3570-2BC2BEC2C535}
WHQLLevel 1
Display Modes
640 x 480 D3DFMT_X8R8G8B8 60
1920 x 1200 D3DFMT_R5G6B5 60
D3D Device Types
HAL
Caps
DeviceType 1
MaxPixelShaderValue 3.40282E+038
Caps
D3DCAPS_READ_SCANLINE Yes
Caps2
D3DCAPS2_CANCALIBRATEGAMMA No
D3DCAPS2_DYNAMICTEXTURES Yes
Caps3
D3DCAPS3_ALPHA_FULLSCREEN_FLIP_OR_DISCARD Yes
PresentationIntervals
D3DPRESENT_INTERVAL_ONE Yes
D3DPRESENT_INTERVAL_TWO Yes
D3DPRESENT_INTERVAL_THREE Yes
D3DPRESENT_INTERVAL_FOUR Yes
D3DPRESENT_INTERVAL_IMMEDIATE Yes
CursorCaps
D3DCURSORCAPS_COLOR Yes
D3DCURSORCAPS_LOWRES No
Adapter Formats
D3DFMT_X8R8G8B8 (Fullscreen)
Back Buffer Formats
D3DFMT_A8R8G8B8
MultiSample Types
D3DMULTISAMPLE_NONE
D3DMULTISAMPLE_2_SAMPLES
Here's the code. It uses regular expressions to decide whether each line is an entry or a category. Then, using a list of (depth, category) tuples, it adds the current item to its parent.
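using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

internal class Program
{
    private static void Main(string[] args)
    {
        using (var reader = new StreamReader(File.OpenRead(@"..\..\hd7850all.log")))
        {
            // List of (depth, category) pairs; the root sits at depth -1 so every line nests under it.
            var categories =
                new List<Tuple<int, Category>>(new[] {new Tuple<int, Category>(-1, new Category {Name = "root"})});
            while (!reader.EndOfStream)
            {
                string s = reader.ReadLine();
                if (s == null) throw new InvalidOperationException("Input string cannot be null");
                //bool isCategory = Regex.IsMatch(s, @"^[\w]");
                //bool isSubCategory = Regex.IsMatch(s, @"^ +(?!.* {2,})");
                // Category line: no run of two or more spaces after the indentation.
                bool isAnyCategory = Regex.IsMatch(s, @"^ *(?!.* {2,})");
                // Entry line: a run of two or more spaces that is not part of the indentation.
                bool isEntry = Regex.IsMatch(s, @"(?<!^ *) {2,}");
                // Depth is the number of leading spaces; the name is the line without them.
                int depth = Regex.Match(s, "^ *").Length;
                string name = Regex.Replace(s, "^ *", string.Empty);
                var category = new Category {Name = name};
                var value = new Tuple<int, Category>(depth, category);
                if (isAnyCategory)
                {
                    // Walk backwards to the nearest category that is less indented: that is the parent.
                    for (int i = categories.Count - 1; i >= 0; i--)
                    {
                        Tuple<int, Category> tuple = categories[i];
                        if (tuple.Item1 < depth)
                        {
                            tuple.Item2.Categories.Add(category);
                            break;
                        }
                    }
                    categories.Add(value);
                }
                else if (isEntry)
                {
                    // Entries attach to the most recently seen category, whatever its depth.
                    Tuple<int, Category> tuple = categories.Last();
                    tuple.Item2.Entries.Add(new Entry {Name = name});
                }
            }
            // The root now holds the whole tree.
            Category root = categories[0].Item2;
        }
    }
}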
public class Category
{
    public Category()
    {
        Categories = new List<Category>();
        Entries = new List<Entry>();
    }

    public string Name { get; set; }
    public List<Entry> Entries { get; set; }
    public List<Category> Categories { get; set; }

    public override string ToString()
    {
        return Name;
    }
}

public class Entry
{
    public string Name { get; set; }

    public override string ToString()
    {
        return Name;
    }
}
Do you know of a more efficient method or approach for transforming such data into an object?
- Yes, this question is on-topic. – Jamal, Oct 14, 2013 at 17:02
- Why are you looking for a more efficient approach? Is your code too slow? If it is, what did profiling tell you? What is the slowest part? – svick, Oct 15, 2013 at 21:57
- Not really; it works pretty well, though some categories are seen as entries instead. It's more about getting a different point of view. What you see here took me quite some time to get to: as usual, I started with a complex solution, and after many iterations I'm satisfied with it and it's really simple now. In short, I'm expecting a review of it! – aybe, Oct 15, 2013 at 23:30
1 Answer
Your use of literal spaces in the regex patterns threw me off; I think whitespace should be matched with \s.
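As a minimal sketch of what that could look like (not a drop-in replacement for the code in the question), the indentation and the name/value separator can both be written with \s. The class name LineClassifier and its methods are hypothetical, and the sample lines in the comments assume the log indents with spaces and separates names from values with two or more spaces. Keep in mind that \s also matches tabs, so a tab would count as one unit of depth here.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class LineClassifier
{
    // Leading whitespace gives the nesting depth.
    private static readonly Regex Indent = new Regex(@"^\s*");

    // Two or more whitespace characters between non-whitespace characters
    // separate a name from its value, e.g. "Driver  aticfx32.dll".
    private static readonly Regex Separator = new Regex(@"(?<=\S)\s{2,}(?=\S)");

    public static int Depth(string line)
    {
        return Indent.Match(line).Length;
    }

    public static bool IsEntry(string line)
    {
        return Separator.IsMatch(line);
    }

    // Splits an entry line into its columns, e.g.
    // "640 x 480  D3DFMT_X8R8G8B8  60" gives three parts.
    public static string[] Split(string line)
    {
        return Separator.Split(line.Trim());
    }
}
```

Anchoring the separator between two non-whitespace characters means the indentation can never be mistaken for a separator, so no lookbehind on the start of the line is needed.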
Also your object model doesn't seem to reflect the file's format. I see this:
GraphicAdapter
    .Driver (="aticfx32.dll")
    .Description (="AMD Radeon HD 7800 Series")
    .DriverVersion (="524305656590")
    .VendorId (="4,098")
    .DeviceId (="26,649")
    .SubSysId (="-501,147,829")
    .Revision (="0")
    .DeviceIdentifier (="{D7B71EE2-2B59-11CF-3570-2BC2BEC2C535}")
    .WHQLLevel (="1")
    .DisplayModes
    .DeviceTypes
DisplayMode
    .Resolution (="640 x 480")
    .WhateverThatIs (="D3DFMT_X8R8G8B8")
    .RefreshRateHz (="60")
DeviceType
    .Identifier (="HAL")
    .Caps
    .AdapterFormats
And so on and so forth. What I mean is that there's much more than "entry" and "category" going on here; by merely capturing "entry" and "category", you're missing out on the richness of the data you're parsing.
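A rough sketch of such a model, under a few assumptions: the class and most property names follow the outline above, but Width, Height, Format (standing in for "WhateverThatIs") and the Parse method are my own hypothetical choices, and Parse assumes a display mode line has the shape "640 x 480  D3DFMT_X8R8G8B8  60". Only a handful of the adapter's properties are shown.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class DisplayMode
{
    public int Width { get; set; }
    public int Height { get; set; }
    public string Format { get; set; }       // e.g. "D3DFMT_X8R8G8B8"
    public int RefreshRateHz { get; set; }

    // Assumed line shape: "<width> x <height>  <format>  <refresh rate>".
    public static DisplayMode Parse(string line)
    {
        Match m = Regex.Match(line.Trim(), @"^(\d+)\s*x\s*(\d+)\s+(\S+)\s+(\d+)$");
        if (!m.Success) throw new FormatException("Not a display mode line: " + line);
        return new DisplayMode
        {
            Width = int.Parse(m.Groups[1].Value),
            Height = int.Parse(m.Groups[2].Value),
            Format = m.Groups[3].Value,
            RefreshRateHz = int.Parse(m.Groups[4].Value)
        };
    }
}

public class DeviceType
{
    public DeviceType()
    {
        Caps = new Dictionary<string, string>();
        AdapterFormats = new List<string>();
    }

    public string Identifier { get; set; }                  // e.g. "HAL"
    public Dictionary<string, string> Caps { get; set; }    // capability name -> raw value
    public List<string> AdapterFormats { get; set; }
}

public class GraphicAdapter
{
    public GraphicAdapter()
    {
        DisplayModes = new List<DisplayMode>();
        DeviceTypes = new List<DeviceType>();
    }

    public string Driver { get; set; }
    public string Description { get; set; }
    public string DeviceIdentifier { get; set; }
    public List<DisplayMode> DisplayModes { get; set; }
    public List<DeviceType> DeviceTypes { get; set; }
}
```

With types like these, a caller can ask for adapter.DisplayModes[0].RefreshRateHz instead of searching a list of entries by name.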
- While strongly typed access to data (similar to types generated from XML schema, if this was XML) is nice, I think there is a place for a more weak approach too (more akin to plain LINQ to XML). – svick, Nov 27, 2013 at 2:29