This was generated dynamically: complex app logic chose the link title based on transient data
Solution: use the site's own output page as input
Detail:
Here's the markup we wanted:
<tbody><tr><th class="position">1.</th>
<td class="source"><cite class="link">
<a href="http://news.ft.com/cms/s/36048bf8-0ff7-11d9-ba62-00000e2511c8.html">
No French or German turn on Iraq</a>
</cite></td></tr>
Easy to parse because the markup is semantic: use an XML parser and find the right nodes
Extract the URL and link text, then query the database for link history
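
The parsing step above can be sketched roughly as follows. This is a minimal sketch, not the original implementation: it assumes Python's standard `xml.etree.ElementTree`, it closes the slide's fragment with `</tbody>` so that it is well-formed, and `extract_links` is a hypothetical helper name. The database lookup for link history is left out.

```python
import xml.etree.ElementTree as ET

# The fragment from the slide, closed with </tbody> so it parses as XML.
PAGE = """<tbody><tr><th class="position">1.</th>
<td class="source"><cite class="link">
<a href="http://news.ft.com/cms/s/36048bf8-0ff7-11d9-ba62-00000e2511c8.html">
No French or German turn on Iraq</a>
</cite></td></tr></tbody>"""


def extract_links(markup):
    """Return (position, url, title) for every cited link in the table body."""
    root = ET.fromstring(markup)
    links = []
    for row in root.iter("tr"):
        position = row.findtext("th[@class='position']", default="").strip()
        anchor = row.find("td[@class='source']/cite[@class='link']/a")
        if anchor is None:
            continue  # row without a cited link
        # Collapse the line breaks and indentation inside the link text.
        title = " ".join("".join(anchor.itertext()).split())
        links.append((position, anchor.get("href"), title))
    return links


links = extract_links(PAGE)
```

Each tuple carries the URL and title that would then be used to look up the link history.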
rel="vote-for" or "vote-against" or "vote-abstain"
agreement/reviews/motions
building blocks - Citations and quotes
<cite>
- cite a person or source by name
<cite><a href="">
- cite a linked page
<q>
- a brief inline quotation
<blockquote>
- A longer quotation
Both of these have a cite attribute for a link
<q cite="http://...">
building blocks - Citations and quotes examples
<cite>Oscar Wilde</cite> said:
<blockquote cite="http://www.quotedb.com/quotes/95">
"A cynic is a man who knows the price of everything and
the value of nothing."
</blockquote>
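
A hedged sketch of how a program might read this example back out. It assumes the snippet is wrapped in a `<div>` so that it has a single root element, and that the first `<cite>` names the speaker of every quotation in the wrapper; both are assumptions made for illustration, and `extract_quotes` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# The slide's example, wrapped in a <div> (an assumption) to give it one root.
SNIPPET = """<div><cite>Oscar Wilde</cite> said:
<blockquote cite="http://www.quotedb.com/quotes/95">
"A cynic is a man who knows the price of everything and
the value of nothing."
</blockquote></div>"""


def extract_quotes(markup):
    """Collect every <q> and <blockquote> with its cite attribute and text."""
    root = ET.fromstring(markup)
    speaker = root.findtext("cite")  # assumed to apply to all quotes found
    quotes = []
    for el in root.iter():
        if el.tag in ("q", "blockquote"):
            text = " ".join("".join(el.itertext()).split())
            quotes.append({"by": speaker, "source": el.get("cite"), "text": text})
    return quotes


quotes = extract_quotes(SNIPPET)
```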
building blocks - Lists
Programs have arrays, XHTML has lists
<ol><li>
- ordered array
<ul><li>
- unordered collection
Can nest these
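
To make the array analogy concrete, here is a small sketch that turns a nested list into nested Python lists. The sample markup and the `list_to_python` name are made up for illustration; an item that contains a sub-list is represented as a `(text, sublist)` tuple, which is just one possible convention.

```python
import xml.etree.ElementTree as ET

# Invented sample: an ordered list whose second item holds an unordered list.
SAMPLE = "<ol><li>first</li><li>second<ul><li>a</li><li>b</li></ul></li></ol>"


def list_to_python(list_el):
    """Convert an <ol> or <ul> element into a Python list, recursing into nested lists."""
    items = []
    for li in list_el.findall("li"):
        text = (li.text or "").strip()
        nested = [list_to_python(child) for child in li if child.tag in ("ol", "ul")]
        # Items with a sub-list become (text, sublist); plain items stay strings.
        items.append((text, nested[0]) if nested else text)
    return items


result = list_to_python(ET.fromstring(SAMPLE))
```

Both list types come back as Python lists here; only for `<ol>` does the order carry meaning.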
building blocks - Tables
Can be used for a 2D array
<table>
- define grid
<thead><tr><th><th>
- column heads
<tbody><tr><th>
- row heads
<td>data</td>
- elements
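
A sketch of reading such a table back into a 2D array. The vote counts in the sample are invented purely for illustration, and `table_to_grid` is a hypothetical helper; it assumes the first `<th>` in the head row is an empty corner cell above the row heads.

```python
import xml.etree.ElementTree as ET

# Invented sample data: two rows, two data columns.
TABLE = """<table>
<thead><tr><th></th><th>for</th><th>against</th></tr></thead>
<tbody>
<tr><th>motion 1</th><td>3</td><td>1</td></tr>
<tr><th>motion 2</th><td>0</td><td>4</td></tr>
</tbody>
</table>"""


def table_to_grid(markup):
    """Return (column heads, row heads, 2D list of cell text)."""
    root = ET.fromstring(markup)
    # Skip the corner cell that sits above the row heads.
    columns = [(th.text or "").strip() for th in root.findall("thead/tr/th")][1:]
    rows, grid = [], []
    for tr in root.findall("tbody/tr"):
        rows.append((tr.findtext("th") or "").strip())
        grid.append([(td.text or "").strip() for td in tr.findall("td")])
    return columns, rows, grid


columns, rows, grid = table_to_grid(TABLE)
```

With the heads separated out, `grid[r][c]` is the cell for `rows[r]` and `columns[c]`.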
building blocks - Definition lists
These correspond approximately to dictionaries (hash tables)
<dl>
<dt>key</dt>
<dd>value</dd>
</dl>
Uniqueness constraints are not explicit: nothing in the markup stops a <dt> key from repeating
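
Because the markup does not enforce unique keys, a consumer has to decide what to do with repeats. The sketch below is one possible policy, not a standard one: it maps each `<dt>` to the list of `<dd>` values that follow it and raises on a duplicate key. `dl_to_dict` is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def dl_to_dict(markup):
    """Map each <dt> to the list of <dd> values that follow it.

    Raises ValueError on a repeated <dt>, making the uniqueness
    constraint explicit on the consumer side.
    """
    root = ET.fromstring(markup)
    result = {}
    key = None
    for child in root:
        text = " ".join("".join(child.itertext()).split())
        if child.tag == "dt":
            key = text
            if key in result:
                raise ValueError(f"duplicate key: {key!r}")
            result[key] = []
        elif child.tag == "dd" and key is not None:
            result[key].append(text)
    return result


parsed = dl_to_dict("<dl><dt>key</dt><dd>value</dd></dl>")
```

Values are kept as lists because a single `<dt>` may be followed by several `<dd>` elements.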
Existing examples
systematizing existing behaviour using these building blocks
<div><h3>professional</h3>
<dl>
<dt id="co-worker">co-worker</dt>
<dd>Someone a person works with, or works at the same organization as. Symmetric. Usually transitive.</dd>
<dt id="colleague">colleague</dt>
<dd>Someone in the same field of study/activity. Symmetric. Often transitive.</dd>
</dl>
</div>
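
Since each term carries an `id`, the definitions above can be indexed by it. A rough sketch, assuming every `<dt>` is followed by exactly one `<dd>` as in this example; `definitions_by_id` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

PROFESSIONAL = """<div><h3>professional</h3>
<dl>
<dt id="co-worker">co-worker</dt>
<dd>Someone a person works with, or works at the same organization as. Symmetric. Usually transitive.</dd>
<dt id="colleague">colleague</dt>
<dd>Someone in the same field of study/activity. Symmetric. Often transitive.</dd>
</dl>
</div>"""


def definitions_by_id(markup):
    """Map each <dt> id to the text of the <dd> paired with it."""
    dl = ET.fromstring(markup).find("dl")
    terms = dl.findall("dt")
    descriptions = dl.findall("dd")
    # Pairing by position assumes one <dd> per <dt>.
    return {dt.get("id"): (dd.text or "").strip() for dt, dd in zip(terms, descriptions)}


definitions = definitions_by_id(PROFESSIONAL)
```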