"Watermarking" Text Data
There are a few ways to "watermark" text. Certain letters and numbers look
so much like each other that swapping one for another, in effect a
deliberate misspelling, marks the text without appearing to. The marks can
then be used to detect the source of text that has been copied and pasted,
and to provide evidence of plagiarism. An example probably explains it
better:
Love them tigers! = Love thern tigers!
Open source = 0pen source
I love you = I 1ove you
and so on. Spammers commonly use the same sorts of replacements to try to
sneak past word fi!ters for commonly spammed items, e.g. that male erection
stuff. 0ver the course of a few words, or in a headline or subject, these
are fairly easy to see, but in a long bit of text you can miss them pretty
easily. (Go back and look at the word "Over" at the start of the last
sentence, and the more obvious "filters" in the one before it.)
If the copied text has YOUR specific pattern of replacements (e.g. always
replace the 3rd "O" with "0" and the 1st and 4th "l" with "1" and...) then
you know it came from you. A script can easily parse the text, look for
these marks, and ring a bell when it finds them.
Some look-alike replacements:
O = 0
I = 1, !
m = rn
V = \/
" = ''
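The marking and checking described above can be roughly sketched in Python. This is only an illustration: the names `PATTERN`, `embed` and `carries_mark` are made up here, the pattern is the "3rd O, 1st and 4th l" example from the text, and the check assumes you still have your clean original to compare against.

```python
# Hypothetical pattern format (an assumption for this sketch):
# letter -> (look-alike, set of 1-based occurrence numbers to replace)
PATTERN = {
    "O": ("0", {3}),      # replace the 3rd capital "O" with a zero
    "l": ("1", {1, 4}),   # replace the 1st and 4th lowercase "l" with a one
}

def embed(text, pattern):
    """Return text with the chosen occurrences swapped for look-alikes."""
    counts = {letter: 0 for letter in pattern}
    out = []
    for ch in text:
        if ch in pattern:
            counts[ch] += 1
            replacement, positions = pattern[ch]
            if counts[ch] in positions:
                ch = replacement
        out.append(ch)
    return "".join(out)

def carries_mark(original, suspect, pattern):
    """True if the suspect text contains your marked copy of the original."""
    marked = embed(original, pattern)
    # If the pattern changed nothing, there is no mark to find.
    return marked != original and marked in suspect

original = "Open source: Our Own Office will fill all halls."
marked = embed(original, PATTERN)
print(marked)
print(carries_mark(original, "Copied from somewhere: " + marked, PATTERN))
```

An exact substring match is the simplest possible check. Text that was re-wrapped, trimmed or only partly copied would need a more forgiving comparison.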
Other possibilities include:
- Double spaces (very effective in HTML as only one space will be displayed)
- Depending on the character set there are often extended symbols, some of
  which are close enough to the standard ones. E.g. i í
- Inconsistent use of abbreviations. E.g. "I live on Happy Ave. but my friend
  lives on Sad Street"
- Inserting text into HTML with tags that cause it to not be displayed, but
  that will still allow it to be copied. E.g. You see this, <font
  style="display:none" color="#FFFFFF">but not this, although it is
  copied</font>
- Replacing certain letters or symbols with a picture of that symbol. This
  causes the item to display, but not be copied.
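The double-space idea from the list above can be sketched the same way. Here a string of bits decides which gaps between words get two spaces instead of one. The function names and the bit-string format are assumptions made for this example, and the check has to run on the raw text or HTML source, since the rendered page shows single spaces.

```python
import re

def embed_bits(text, bits):
    """Put a double space after word i when bits[i] is "1".

    Note: split() normalizes all existing whitespace, so the input is
    assumed to be a single paragraph with no meaningful line breaks.
    """
    words = text.split()
    if not words:
        return text
    out = []
    for i, word in enumerate(words[:-1]):
        gap = "  " if i < len(bits) and bits[i] == "1" else " "
        out.append(word + gap)
    out.append(words[-1])
    return "".join(out)

def extract_bits(text, n):
    """Read the first n gaps back: a run of 2+ spaces is "1", one is "0"."""
    gaps = re.findall(r" +", text)
    return "".join("1" if len(gap) > 1 else "0" for gap in gaps[:n])

marked = embed_bits("one two three four five", "1010")
print(repr(marked))
print(extract_bits(marked, 4))
```

A different bit string per recipient would let the same page be traced back to whoever received that particular copy, as long as the spacing survives the copying.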
file: /Techref/datafile/textwatermarks.htm, updated: 2008-03-14 12:03