Extended maintenance of Ruby 1.9.3 ended on February 23, 2015.

URI

URI is a module providing classes to handle Uniform Resource Identifiers (RFC 2396).

Features

  • Uniform handling of URIs

  • Flexibility to introduce custom URI schemes

  • Flexibility to have an alternate URI::Parser (or just different patterns and regexps)

Basic example

require 'uri'
uri = URI("http://foo.com/posts?id=30&limit=5#time=1305298413")
#=> #<URI::HTTP:0x00000000b14880
 URL:http://foo.com/posts?id=30&limit=5#time=1305298413>
uri.scheme
#=> "http"
uri.host
#=> "foo.com"
uri.path
#=> "/posts"
uri.query
#=> "id=30&limit=5"
uri.fragment
#=> "time=1305298413"
uri.to_s
#=> "http://foo.com/posts?id=30&limit=5#time=1305298413"

Adding custom URIs

module URI
  class RSYNC < Generic
    DEFAULT_PORT = 873
  end
  @@schemes['RSYNC'] = RSYNC
end
#=> URI::RSYNC
URI.scheme_list
#=> {"FTP"=>URI::FTP, "HTTP"=>URI::HTTP, "HTTPS"=>URI::HTTPS,
 "LDAP"=>URI::LDAP, "LDAPS"=>URI::LDAPS, "MAILTO"=>URI::MailTo,
 "RSYNC"=>URI::RSYNC}
uri = URI("rsync://rsync.foo.com")
#=> #<URI::RSYNC:0x00000000f648c8 URL:rsync://rsync.foo.com>

RFC References

A good place to view an RFC spec is www.ietf.org/rfc.html

Here is a list of all related RFCs.

Class tree

Copyright Info

Author

Akira Yamada <akira@ruby-lang.org>

Documentation

Akira Yamada <akira@ruby-lang.org>
Dmitry V. Sabanin <sdmitry@lrn.ru>
Vincent Batts <vbatts@hashbangbash.com>

License

Copyright © 2001 akira yamada <akira@ruby-lang.org>. You can redistribute it and/or modify it under the same terms as Ruby.

Revision

$Id: uri.rb 31555 2011-05-13 20:03:21Z drbrain $

Public Class Methods

decode_www_form(str, enc=Encoding::UTF_8)

Decode URL-encoded form data from given str.

This decodes application/x-www-form-urlencoded data and returns an array of key-value arrays. It uses ::decode_www_form_component internally.

The charset hack is not supported yet because the mapping from a given charset to a Ruby encoding is not yet clear. See also www.w3.org/TR/html5/syntax.html#character-encodings-0

This refers to www.w3.org/TR/html5/forms.html#url-encoded-form-data

ary = URI.decode_www_form("a=1&a=2&b=3")
p ary                    #=> [["a", "1"], ["a", "2"], ["b", "3"]]
p ary.assoc('a').last    #=> "1"
p ary.assoc('b').last    #=> "3"
p ary.rassoc('2').first  #=> "a"
p Hash[ary]              #=> {"a"=>"2", "b"=>"3"}

See ::decode_www_form_component, ::encode_www_form
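Because Hash[ary] keeps only the last value for a repeated key, it can help to group the decoded pairs by key instead. The following is a minimal sketch, not part of the URI module; group_params is a hypothetical helper name chosen for illustration.

```ruby
require 'uri'

# Hypothetical helper (not part of URI): collect every value for each key,
# so repeated keys such as "a=1&a=2" are not collapsed to a single value.
def group_params(query)
  URI.decode_www_form(query).each_with_object({}) do |(key, value), grouped|
    (grouped[key] ||= []) << value
  end
end

params = group_params("a=1&a=2&b=3")
p params["a"]  #=> ["1", "2"]
p params["b"]  #=> ["3"]
```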

 
 # File uri/common.rb, line 972
def self.decode_www_form(str, enc=Encoding::UTF_8)
  return [] if str.empty?
  unless /\A#{WFKV_}=#{WFKV_}(?:[;&]#{WFKV_}=#{WFKV_})*\z/o =~ str
    raise ArgumentError, "invalid data of application/x-www-form-urlencoded (#{str})"
  end
  ary = []
  $&.scan(/([^=;&]+)=([^;&]*)/) do
    ary << [decode_www_form_component($1, enc), decode_www_form_component($2, enc)]
  end
  ary
end
 
decode_www_form_component(str, enc=Encoding::UTF_8)

Decode given str of URL-encoded form data.

This decodes + to SP (ASCII space).

See ::encode_www_form_component, ::decode_www_form
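As a small illustration of the behaviour described above, the sketch below decodes one component and shows that a stray % is rejected. The input strings are made up for the example.

```ruby
require 'uri'

# "+" becomes a space and %XX sequences become the corresponding bytes.
decoded = URI.decode_www_form_component("Ruby+%26+Rails%21")
p decoded           #=> "Ruby & Rails!"
p decoded.encoding  #=> #<Encoding:UTF-8>

# A "%" that is not followed by two hex digits is invalid input.
begin
  URI.decode_www_form_component("100%")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```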

 
 # File uri/common.rb, line 897
def self.decode_www_form_component(str, enc=Encoding::UTF_8)
 raise ArgumentError, "invalid %-encoding (#{str})" unless /\A[^%]*(?:%\h\h[^%]*)*\z/ =~ str
 str.gsub(/\+|%\h\h/, TBLDECWWWCOMP_).force_encoding(enc)
end
 
encode_www_form(enum)

Generate URL-encoded form data from given enum.

This generates application/x-www-form-urlencoded data, as defined in HTML5, from a given Enumerable object.

This internally uses ::encode_www_form_component.

This method doesn't convert the encoding of the given items, so convert them before calling this method if you want to send data in an encoding other than the original, or as mixed-encoding data. (Strings which are encoded in an HTML5 ASCII-incompatible encoding are converted to UTF-8.)

This method doesn't handle files. When you send a file, use multipart/form-data.

This is an implementation of www.w3.org/TR/html5/forms.html#url-encoded-form-data

URI.encode_www_form([["q", "ruby"], ["lang", "en"]])
#=> "q=ruby&lang=en"
URI.encode_www_form("q" => "ruby", "lang" => "en")
#=> "q=ruby&lang=en"
URI.encode_www_form("q" => ["ruby", "perl"], "lang" => "en")
#=> "q=ruby&q=perl&lang=en"
URI.encode_www_form([["q", "ruby"], ["q", "perl"], ["lang", "en"]])
#=> "q=ruby&q=perl&lang=en"

See ::encode_www_form_component, ::decode_www_form
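A common use is to build a query string and attach it to a URI object. The sketch below assumes an illustrative endpoint (http://example.com/search) and parameter names; it also shows that a nil value produces a bare key with no "=".

```ruby
require 'uri'

# Build the query from a Hash; non-String values are converted with to_s.
uri = URI("http://example.com/search")
uri.query = URI.encode_www_form("q" => "ruby uri", "page" => 2)
puts uri.to_s   #=> http://example.com/search?q=ruby+uri&page=2

# A nil value yields only the encoded key.
flag_only = URI.encode_www_form([["debug", nil], ["q", "x"]])
puts flag_only  #=> debug&q=x
```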

 
 # File uri/common.rb, line 930
def self.encode_www_form(enum)
  enum.map do |k,v|
    if v.nil?
      encode_www_form_component(k)
    elsif v.respond_to?(:to_ary)
      v.to_ary.map do |w|
        str = encode_www_form_component(k)
        unless w.nil?
          str << '='
          str << encode_www_form_component(w)
        end
      end.join('&')
    else
      str = encode_www_form_component(k)
      str << '='
      str << encode_www_form_component(v)
    end
  end.join('&')
end
 
encode_www_form_component(str)

Encode given str to URL-encoded form data.

This method doesn't convert *, -, ., 0-9, A-Z, _, a-z, but does convert SP (ASCII space) to + and converts others to %XX.

This is an implementation of www.w3.org/TR/html5/forms.html#url-encoded-form-data

See ::decode_www_form_component, ::encode_www_form
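The conversion rules above can be checked with a few illustrative inputs (the strings are made up for the example; "\u00E9" is the character é written as an escape so the snippet does not depend on the source file encoding).

```ruby
require 'uri'

# Space becomes "+"; reserved and other characters become %XX.
p URI.encode_www_form_component("a b&c=d/e~")  #=> "a+b%26c%3Dd%2Fe%7E"

# Non-ASCII characters are encoded byte by byte (here as UTF-8).
p URI.encode_www_form_component("caf\u00E9")   #=> "caf%C3%A9"

# The characters *, -, . and _ are left untouched.
p URI.encode_www_form_component("*-._")        #=> "*-._"
```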

 
 # File uri/common.rb, line 880
def self.encode_www_form_component(str)
  str = str.to_s
  if HTML5ASCIIINCOMPAT.include?(str.encoding)
    str = str.encode(Encoding::UTF_8)
  else
    str = str.dup
  end
  str.force_encoding(Encoding::ASCII_8BIT)
  str.gsub!(/[^*\-.0-9A-Z_a-z]/, TBLENCWWWCOMP_)
  str.force_encoding(Encoding::US_ASCII)
end
 
extract(str, schemes = nil, &block)

Synopsis

URI::extract(str[, schemes][,&blk])

Args

str

String to extract URIs from.

schemes

Limit URI matching to specific schemes.

Description

Extracts URIs from a string. If a block is given, yields each matched URI to the block and returns nil; otherwise returns an array of the matches.

Usage

require "uri"
URI.extract("text here http://foo.example.org/bla and here mailto:test@example.com and here also.")
# => ["http://foo.example.org/bla", "mailto:test@example.com"]
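The schemes argument and the block form can be sketched as follows. The sample text is made up for illustration, and the behaviour shown is the one described for this version of the library.

```ruby
require 'uri'

text = "docs at http://foo.example.org/bla or write to mailto:test@example.com today"

# Restrict matching to the listed schemes.
http_only = URI.extract(text, ["http", "https"])
p http_only  #=> ["http://foo.example.org/bla"]

# With a block, each match is yielded and the method itself returns nil.
seen = []
result = URI.extract(text) { |u| seen << u }
p result     #=> nil
p seen.size  #=> 2
```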
 
 # File uri/common.rb, line 812
def self.extract(str, schemes = nil, &block)
  DEFAULT_PARSER.extract(str, schemes, &block)
end
 
join(*str)

Synopsis

URI::join(str[, str, ...])

Args

str

String(s) to work with

Description

Joins URIs.

Usage

require 'uri'
p URI.join("http://example.com/","main.rbx")
# => #<URI::HTTP:0x2022ac02 URL:http://example.com/main.rbx>
p URI.join('http://example.com', 'foo')
# => #<URI::HTTP:0x01ab80a0 URL:http://example.com/foo>
p URI.join('http://example.com', '/foo', '/bar')
# => #<URI::HTTP:0x01aaf0b0 URL:http://example.com/bar>
p URI.join('http://example.com', '/foo', 'bar')
# => #<URI::HTTP:0x801a92af0 URL:http://example.com/bar>
p URI.join('http://example.com', '/foo/', 'bar')
# => #<URI::HTTP:0x80135a3a0 URL:http://example.com/foo/bar>
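The examples above show that a trailing slash on the base decides whether its last path segment is kept. A short sketch with made-up paths makes the difference explicit, and also shows a ".." reference being resolved.

```ruby
require 'uri'

# Base ends with "/": the relative reference is appended to the directory.
puts URI.join("http://example.com/api/v1/", "users")  #=> http://example.com/api/v1/users

# Base has no trailing "/": the last segment ("v1") is replaced.
puts URI.join("http://example.com/api/v1", "users")   #=> http://example.com/api/users

# ".." moves up one directory before appending.
puts URI.join("http://example.com/a/b/", "../c")      #=> http://example.com/a/c
```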
 
 # File uri/common.rb, line 784
def self.join(*str)
  DEFAULT_PARSER.join(*str)
end
 
parse(uri)

Synopsis

URI::parse(uri_str)

Args

uri_str

String with URI.

Description

Creates an instance of one of URI's subclasses from the string.

Raises

URI::InvalidURIError

Raised if the given URI is not valid.

Usage

require 'uri'
uri = URI.parse("http://www.ruby-lang.org/")
p uri
# => #<URI::HTTP:0x202281be URL:http://www.ruby-lang.org/>
p uri.scheme
# => "http"
p uri.host
# => "www.ruby-lang.org"
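When the input comes from an untrusted source, the URI::InvalidURIError can be rescued. This is a minimal sketch; parse_or_nil is a hypothetical helper name, not part of the URI module, and the URLs are illustrative.

```ruby
require 'uri'

# Hypothetical helper (not part of URI): return nil instead of raising
# when the string cannot be parsed as a URI.
def parse_or_nil(str)
  URI.parse(str)
rescue URI::InvalidURIError
  nil
end

good = parse_or_nil("https://www.ruby-lang.org/en/")
p good.class  #=> URI::HTTPS
p good.port   #=> 443

# A space is not allowed inside a host name, so this input is rejected.
p parse_or_nil("http://exa mple.com/")  #=> nil
```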
 
 # File uri/common.rb, line 746
def self.parse(uri)
  DEFAULT_PARSER.parse(uri)
end
 
regexp(schemes = nil)

Synopsis

URI::regexp([match_schemes])

Args

match_schemes

Array of schemes. If given, the resulting regexp matches only URIs whose scheme is one of the match_schemes.

Description

Returns a Regexp object which matches URI-like strings. The Regexp object returned by this method includes an arbitrary number of capture groups (parentheses). Never rely on their number.

Usage

require 'uri'
# extract first URI from html_string
html_string.slice(URI.regexp)
# remove ftp URIs
html_string.gsub(URI.regexp(['ftp']), '')
# You should not rely on the number of parentheses
html_string.scan(URI.regexp) do |*matches|
 p $&
end
 
 # File uri/common.rb, line 847
def self.regexp(schemes = nil)
  DEFAULT_PARSER.make_regexp(schemes)
end
 
scheme_list()

Returns a Hash of the defined schemes

 
 # File uri/common.rb, line 659
def self.scheme_list
  @@schemes
end
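The returned Hash is keyed by upper-case scheme names, as the scheme_list output in the "Adding custom URIs" section shows. A minimal sketch of a lookup follows; known_scheme? is a hypothetical helper name, not part of the URI module.

```ruby
require 'uri'

# Hypothetical helper (not part of URI): check whether a scheme has a
# dedicated URI class registered. Keys are upper-case strings.
def known_scheme?(name)
  URI.scheme_list.key?(name.to_s.upcase)
end

p known_scheme?("https")   #=> true
p known_scheme?(:gopher)   #=> false
p URI.scheme_list["HTTP"]  #=> URI::HTTP
```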
 
split(uri)

Synopsis

URI::split(uri)

Args

uri

String with URI.

Description

Splits the string into the following parts and returns them as an array:

* Scheme
* Userinfo
* Host
* Port
* Registry
* Path
* Opaque
* Query
* Fragment

Usage

require 'uri'
p URI.split("http://www.ruby-lang.org/")
# => ["http", nil, "www.ruby-lang.org", nil, nil, "/", nil, nil, nil]
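Since the array always has nine elements in the order listed above, it can be destructured directly. The sketch below uses a made-up URL; note that the port comes back as a String, not an Integer.

```ruby
require 'uri'

# Destructure in the documented order; unused parts are prefixed with "_".
scheme, userinfo, host, port, _registry, path, _opaque, query, fragment =
  URI.split("http://user@www.example.com:8080/a/b?x=1#top")

p [scheme, userinfo, host, port]  #=> ["http", "user", "www.example.com", "8080"]
p [path, query, fragment]         #=> ["/a/b", "x=1", "top"]
```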
 
 # File uri/common.rb, line 711
def self.split(uri)
  DEFAULT_PARSER.split(uri)
end
 
