143

I am pulling a JSON file from a site and one of the strings received is:

The Weeknd &#8216;King Of The Fall&#8217; [Video Premiere] | @TheWeeknd | #SoPhi

How can I convert things like &#8216; into the correct characters?

I've made a Xcode Playground to demonstrate it:

import UIKit
var error: NSError?
let blogUrl: NSURL = NSURL.URLWithString("http://sophisticatedignorance.net/api/get_recent_summary/")
let jsonData = NSData(contentsOfURL: blogUrl)
let dataDictionary = NSJSONSerialization.JSONObjectWithData(jsonData, options: nil, error: &error) as NSDictionary
var a = dataDictionary["posts"] as NSArray
println(a[0]["title"])
asked Sep 1, 2014 at 13:47

23 Answers

194

This answer was last revised for Swift 5.2 and iOS 13.4 SDK.


There's no straightforward way to do that, but you can use NSAttributedString magic to make this process as painless as possible (be warned that this method will strip all HTML tags as well).

Remember to initialize NSAttributedString from main thread only. It uses WebKit to parse HTML underneath, thus the requirement.

// This is a[0]["title"] in your case
let htmlEncodedString = "The Weeknd <em>&#8216;King Of The Fall&#8217;</em>"
guard let data = htmlEncodedString.data(using: .utf8) else {
 return
}
let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
 .documentType: NSAttributedString.DocumentType.html,
 .characterEncoding: String.Encoding.utf8.rawValue
]
guard let attributedString = try? NSAttributedString(data: data, options: options, documentAttributes: nil) else {
 return
}
// The Weeknd ‘King Of The Fall’
let decodedString = attributedString.string
// Or, wrapped in a failable initializer:
extension String {
 init?(htmlEncodedString: String) {
 guard let data = htmlEncodedString.data(using: .utf8) else {
 return nil
 }
 let options: [NSAttributedString.DocumentReadingOptionKey: Any] = [
 .documentType: NSAttributedString.DocumentType.html,
 .characterEncoding: String.Encoding.utf8.rawValue
 ]
 guard let attributedString = try? NSAttributedString(data: data, options: options, documentAttributes: nil) else {
 return nil
 }
 self.init(attributedString.string)
 }
}
let encodedString = "The Weeknd <em>&#8216;King Of The Fall&#8217;</em>"
let decodedString = String(htmlEncodedString: encodedString)
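Because the HTML importer has to run on the main thread, code that decodes on a background queue needs to hop to the main thread first. A minimal sketch, assuming a hypothetical helper named `onMainThread` (not part of any SDK) that runs a closure synchronously on the main thread and returns its result:

```swift
import Foundation

/// Hypothetical helper: runs `work` on the main thread and returns its result.
/// When already on the main thread it calls `work` directly, because
/// `DispatchQueue.main.sync` would deadlock there.
func onMainThread<T>(_ work: () -> T) -> T {
    if Thread.isMainThread {
        return work()
    }
    return DispatchQueue.main.sync(execute: work)
}

// Usage from a background queue, with the extension above:
// let title = onMainThread { String(htmlEncodedString: encodedString) }
```

Note that the calling thread blocks until the main thread is free, so this does not make the conversion any cheaper; it only moves it to the thread the importer expects.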
answered Sep 1, 2014 at 14:03

18 Comments

+1 for the answer, -1 for preferring an extension over a method. It will not be clear to the next developer that stringByConvertingFromHTML is an extension; clarity is the single most important attribute a program can have.
What? Extensions are meant to extend existing types to provide new functionality.
I understand what you're trying to say, but negating extensions isn't the way to go.
This method is extremely heavy and is not recommended in tableviews or gridviews
This is great! Although it blocks the main thread, is there any way to run it in the background thread?
104

@akashivskyy's answer is great and demonstrates how to utilize NSAttributedString to decode HTML entities. One possible disadvantage (as he stated) is that all HTML markup is removed as well, so

<strong> 4 &lt; 5 &amp; 3 &gt; 2</strong>

becomes

4 < 5 & 3 > 2

On OS X there is CFXMLCreateStringByUnescapingEntities() which does the job:

let encoded = "<strong> 4 &lt; 5 &amp; 3 &gt; 2 .</strong> Price: 12 &#x20ac;. &#64; "
let decoded = CFXMLCreateStringByUnescapingEntities(nil, encoded as CFString, nil) as String
print(decoded)
// <strong> 4 < 5 & 3 > 2 .</strong> Price: 12 €. @ 

but this is not available on iOS.

Here is a pure Swift implementation, for Swift 4 and later. It decodes character entity references like &lt; using a dictionary, and all numeric character references like &#64; or &#x20ac;. (Note that I did not list all 252 HTML entities explicitly.)

// Mapping from XML/HTML character entity reference to character
// From http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references
private let characterEntities : [ Substring : Character ] = [
 // XML predefined entities:
 "&quot;" : "\"",
 "&amp;" : "&",
 "&apos;" : "'",
 "&lt;" : "<",
 "&gt;" : ">",
 // HTML character entity references:
 "&nbsp;" : "\u{00a0}",
 // ...
 "&diams;" : "♦",
]
extension String {
 /// Returns a new string made by replacing in the `String`
 /// all HTML character entity references with the corresponding
 /// character.
 var stringByDecodingHTMLEntities : String {
 // ===== Utility functions =====
 // Convert the number in the string to the corresponding
 // Unicode character, e.g.
 // decodeNumeric("64", base: 10) --> "@"
 // decodeNumeric("20ac", base: 16) --> "€"
 func decodeNumeric(_ string : Substring, base : Int) -> Character? {
 guard let code = UInt32(string, radix: base),
 let uniScalar = UnicodeScalar(code) else { return nil }
 return Character(uniScalar)
 }
 // Decode the HTML character entity to the corresponding
 // Unicode character, return `nil` for invalid input.
 // decode("&#64;") --> "@"
 // decode("&#x20ac;") --> "€"
 // decode("&lt;") --> "<"
 // decode("&foo;") --> nil
 func decode(_ entity : Substring) -> Character? {
 if entity.hasPrefix("&#x") || entity.hasPrefix("&#X") {
 return decodeNumeric(entity.dropFirst(3).dropLast(), base: 16)
 } else if entity.hasPrefix("&#") {
 return decodeNumeric(entity.dropFirst(2).dropLast(), base: 10)
 } else {
 return characterEntities[entity]
 }
 }
 // ===== Method starts here =====
 var result = ""
 var position = startIndex
 // Find the next '&' and copy the characters preceding it to `result`:
 while let ampRange = self[position...].range(of: "&") {
 result.append(contentsOf: self[position ..< ampRange.lowerBound])
 position = ampRange.lowerBound
 // Find the next ';' and copy everything from '&' to ';' into `entity`
 guard let semiRange = self[position...].range(of: ";") else {
 // No matching ';'.
 break
 }
 let entity = self[position ..< semiRange.upperBound]
 if let decoded = decode(entity) {
 // Replace by decoded character:
 result.append(decoded)
 position = semiRange.upperBound
 } else {
 // Invalid entity, copy verbatim:
 result.append(contentsOf: self[ampRange])
 position = ampRange.upperBound
 }
 }
 // Copy remaining characters to `result`:
 result.append(contentsOf: self[position...])
 return result
 }
}

Example:

let encoded = "&<strong> 4 &lt; 5 &amp; 3 &gt; 2 .</strong> Price: 12 &#x20ac;. &#64; &"
let decoded = encoded.stringByDecodingHTMLEntities
print(decoded)
// &<strong> 4 < 5 & 3 > 2 .</strong> Price: 12 €. @ &
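The numeric branch of `decode` depends on two failable initializers: `UInt32(_:radix:)` rejects input that is not a number in the given base, and `Unicode.Scalar(_:)` rejects values that are not valid Unicode scalars (surrogate code points and anything above U+10FFFF). A small stand-alone sketch of that behaviour, using a hypothetical helper `scalar(fromDigits:radix:)` that is not part of the answer's code:

```swift
/// Hypothetical helper mirroring `decodeNumeric` above, but returning the
/// scalar itself so that its numeric value can be inspected.
func scalar(fromDigits digits: String, radix: Int) -> Unicode.Scalar? {
    guard let code = UInt32(digits, radix: radix) else { return nil }
    return Unicode.Scalar(code)
}

let at = scalar(fromDigits: "64", radix: 10)            // U+0040, "@"
let euro = scalar(fromDigits: "20ac", radix: 16)        // U+20AC, "€"
let surrogate = scalar(fromDigits: "D800", radix: 16)   // nil: surrogate code point
let tooLarge = scalar(fromDigits: "110000", radix: 16)  // nil: above U+10FFFF
let notANumber = scalar(fromDigits: "foo", radix: 10)   // nil: not a decimal number
```

This is why an entity such as `&#xD800;` or `&#foo;` falls through to the "copy verbatim" branch instead of crashing.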
answered May 9, 2015 at 15:21

16 Comments

This is brilliant, thanks Martin! Here's the extension with the full list of HTML entities: gist.github.com/mwaterfall/25b4a6a06dc3309d9555 I've also slightly adapted it to provide the distance offsets made by the replacements. This allows the correct adjustment of any string attributes or entities that might be affected by these replacements (Twitter entity indices for example).
@MichaelWaterfall and Martin this is magnific! works like a charm! I update the extension for Swift 2 pastebin.com/juHRJ6au Thanks!
I converted this answer to be compatible with Swift 2 and dumped it in a CocoaPod called StringExtensionHTML for ease of use. Note that Santiago's Swift 2 version fixes the compile time errors, but taking out the strtoul(string, nil, base) entirely will cause the code not to work with numeric character entities and crash when it comes to an entity it doesn't recognize (instead of failing gracefully).
@AdelaChang: Actually I had converted my answer to Swift 2 already in September 2015. It still compiles without warnings with Swift 2.2/Xcode 7.3. Or are you referring to Michael's version?
36

Swift 4


  • String extension computed variable
  • Without extra guard, do, catch, etc...
  • Returns the original string if decoding fails

extension String {
 var htmlDecoded: String {
 let decoded = try? NSAttributedString(data: Data(utf8), options: [
 .documentType: NSAttributedString.DocumentType.html,
 .characterEncoding: String.Encoding.utf8.rawValue
 ], documentAttributes: nil).string
 return decoded ?? self
 }
}
answered Nov 24, 2017 at 22:43

3 Comments

Wow ! works right out of the box for Swift 4 !. Usage // let encoded = "The Weeknd &#8216;King Of The Fall&#8217;" let finalString = encoded.htmlDecoded
I love the simplicity of this answer. However, it will cause crashes when run in the background because it tries to run on the main thread.
@JeremyHicks do you know how to fix this crash? It works perfectly in a Playground, but it crashes when I apply htmlDecoded to a decoded JSON field. I added DispatchQueue.main.async, but it is not working.
30

Swift 3 version of @akashivskyy's extension:

extension String {
 init(htmlEncodedString: String) {
 self.init()
 guard let encodedData = htmlEncodedString.data(using: .utf8) else {
 self = htmlEncodedString
 return
 }
 let attributedOptions: [String : Any] = [
 NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
 NSCharacterEncodingDocumentAttribute: String.Encoding.utf8.rawValue
 ]
 do {
 let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
 self = attributedString.string
 } catch {
 print("Error: \(error)")
 self = htmlEncodedString
 }
 }
}
answered Sep 6, 2016 at 8:39

2 Comments

Works great. Original answer was causing weird crash. Thanks for update!
For french characters I have to use utf16
14

Swift 2 version of @akashivskyy's extension:

 extension String {
 init(htmlEncodedString: String) {
 if let encodedData = htmlEncodedString.dataUsingEncoding(NSUTF8StringEncoding){
 let attributedOptions : [String: AnyObject] = [
 NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
 NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding
 ]
 do{
 if let attributedString:NSAttributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil){
 self.init(attributedString.string)
 }else{
 print("error")
 self.init(htmlEncodedString) //Returning actual string if there is an error
 }
 }catch{
 print("error: \(error)")
 self.init(htmlEncodedString) //Returning actual string if there is an error
 }
 }else{
 self.init(htmlEncodedString) //Returning actual string if there is an error
 }
 }
 }
answered Dec 12, 2015 at 21:41

1 Comment

This code is incomplete and should be avoided by all means. The error is not being handled properly. When there is in fact an error code would crash. You should update your code to at least return nil when there is an error. Or you could just init with original string. In the end you should handle the error. Which is not the case. Wow!
13

I was looking for a pure Swift 3.0 utility to escape to/unescape from HTML character references (i.e. for server-side Swift apps on both macOS and Linux) but didn't find any comprehensive solutions, so I wrote my own implementation: https://github.com/IBM-Swift/swift-html-entities

The package, HTMLEntities, works with HTML4 named character references as well as hex/dec numeric character references, and it will recognize special numeric character references per the W3 HTML5 spec (i.e. &#x80; should be unescaped as the Euro sign (unicode U+20AC) and NOT as the unicode character for U+0080, and certain ranges of numeric character references should be replaced with the replacement character U+FFFD when unescaping).

Usage example:

import HTMLEntities
// encode example
let html = "<script>alert(\"abc\")</script>"
print(html.htmlEscape())
// Prints "&lt;script&gt;alert(&quot;abc&quot;)&lt;/script&gt;"
// decode example
let htmlencoded = "&lt;script&gt;alert(&quot;abc&quot;)&lt;/script&gt;"
print(htmlencoded.htmlUnescape())
// Prints "<script>alert(\"abc\")</script>"

And for OP's example:

print("The Weeknd &#8216;King Of The Fall&#8217; [Video Premiere] | @TheWeeknd | #SoPhi ".htmlUnescape())
// prints "The Weeknd ‘King Of The Fall’ [Video Premiere] | @TheWeeknd | #SoPhi "

Edit: HTMLEntities now supports HTML5 named character references as of version 2.0.0. Spec-compliant parsing is also implemented.
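The special handling of numeric character references described above can be sketched as follows. This is not the HTMLEntities package's code, only an illustration: the function name `html5Scalar(forCodePoint:)` is hypothetical, and the override table is deliberately partial (the HTML5 specification lists more entries in the 0x80...0x9F range).

```swift
/// Partial table of HTML5 overrides for numeric references in the
/// 0x80...0x9F range (they follow Windows-1252). Not the complete list.
let c1Overrides: [UInt32: UInt32] = [
    0x80: 0x20AC, // Euro sign
    0x85: 0x2026, // horizontal ellipsis
    0x91: 0x2018, // left single quotation mark
    0x92: 0x2019, // right single quotation mark
    0x93: 0x201C, // left double quotation mark
    0x94: 0x201D, // right double quotation mark
    0x99: 0x2122, // trade mark sign
]

/// Hypothetical helper: maps the number from `&#...;` to the scalar that
/// should be emitted when unescaping.
func html5Scalar(forCodePoint code: UInt32) -> Unicode.Scalar {
    let replacement: Unicode.Scalar = "\u{FFFD}"
    // Zero, surrogates, and out-of-range values become U+FFFD.
    if code == 0 || code > 0x10FFFF || (0xD800...0xDFFF).contains(code) {
        return replacement
    }
    let mapped = c1Overrides[code] ?? code
    return Unicode.Scalar(mapped) ?? replacement
}

let euroSign = html5Scalar(forCodePoint: 0x80)     // U+20AC, not U+0080
let invalid = html5Scalar(forCodePoint: 0xD800)    // U+FFFD
let unchanged = html5Scalar(forCodePoint: 0x2018)  // U+2018
```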

answered Sep 29, 2016 at 15:50

3 Comments

This is the most generic answer that works all the time, and not requiring being run on the main thread. This will work even with the most complex HTML escaped unicode strings (such as (&nbsp;͡&deg;&nbsp;͜ʖ&nbsp;͡&deg;&nbsp;)), whereas none of the other answers manage that.
Yeah, this should be way more up! :)
The fact that the original answer is not thread-safe is a very big issue for something so intrinsically low level as a string manipulation
9

Swift 4 Version

extension String {
 init(htmlEncodedString: String) {
 self.init()
 guard let encodedData = htmlEncodedString.data(using: .utf8) else {
 self = htmlEncodedString
 return
 }
 let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
 .documentType: NSAttributedString.DocumentType.html,
 .characterEncoding: String.Encoding.utf8.rawValue
 ]
 do {
 let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
 self = attributedString.string
 } 
 catch {
 print("Error: \(error)")
 self = htmlEncodedString
 }
 }
}
answered Sep 30, 2017 at 9:16

4 Comments

I get "Error Domain=NSCocoaErrorDomain Code=259 "The file couldn’t be opened because it isn’t in the correct format."" when I try to use this. This goes away if I run the full do catch on the main thread. I found this from checking the NSAttributedString documentation: "The HTML importer should not be called from a background thread (that is, the options dictionary includes documentType with a value of html). It will try to synchronize with the main thread, fail, and time out."
Please, the rawValue syntax NSAttributedString.DocumentReadingOptionKey(rawValue: NSAttributedString.DocumentAttributeKey.documentType.rawValue) and NSAttributedString.DocumentReadingOptionKey(rawValue: NSAttributedString.DocumentAttributeKey.characterEncoding.rawValue) is horrible. Replace it with .documentType and .characterEncoding
@MickeDG - Can you please explain what exactly you did to resolve this error? I am getting it sporadically.
@RossBarbish - Sorry Ross, this was too long ago, can't remember the details. Have you tried what I suggest in the comment above, i.e. to run the full do catch on the main thread?
8
extension String{
 func decodeEnt() -> String{
 let encodedData = self.dataUsingEncoding(NSUTF8StringEncoding)!
 let attributedOptions : [String: AnyObject] = [
 NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
 NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding
 ]
 let attributedString = NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil, error: nil)!
 return attributedString.string
 }
}
let encodedString = "The Weeknd &#8216;King Of The Fall&#8217;"
let foo = encodedString.decodeEnt() /* The Weeknd ‘King Of The Fall’ */
answered Sep 1, 2015 at 16:48

3 Comments

Re "The Weeknd": Not "The Weekend"?
The syntax highlighting looks weird, especially the comment part of the last line. Can you fix it?
"The Weeknd" is a singer, and yes, that's the way his name is spelled.
7

Swift 4:

The total solution that finally worked for me with HTML code, newline characters and single quotes:

extension String {
 var htmlDecoded: String {
 let decoded = try? NSAttributedString(data: Data(utf8), options: [
 .documentType: NSAttributedString.DocumentType.html,
 .characterEncoding: String.Encoding.utf8.rawValue
 ], documentAttributes: nil).string
 return decoded ?? self
 }
}

Usage:

let yourStringEncoded = yourStringWithHtmlcode.htmlDecoded

I then had to apply some more filters to get rid of single quotes (for example, don't, hasn't, It's, etc.), and new line characters like \n:

var yourNewString = String(yourStringEncoded.filter { !"\n\t\r".contains($0) })
yourNewString = yourNewString.replacingOccurrences(of: "\'", with: "", options: NSString.CompareOptions.literal, range: nil)
answered Aug 16, 2018 at 19:44

3 Comments

This is essentially a copy of this other answer. All you did is add some usage which is obvious enough.
some one has upvoted this answer and found it really useful, what does that tell you ?
@Naishta It tells you that everyone has different opinions and that's OK
5

This would be my approach. You could add the entities dictionary that Michael Waterfall mentions: https://gist.github.com/mwaterfall/25b4a6a06dc3309d9555

extension String {
 func htmlDecoded()->String {
 guard (self != "") else { return self }
 var newStr = self
 let entities = [
 "&quot;" : "\"",
 "&amp;" : "&",
 "&apos;" : "'",
 "&lt;" : "<",
 "&gt;" : ">",
 ]
 for (name,value) in entities {
 newStr = newStr.stringByReplacingOccurrencesOfString(name, withString: value)
 }
 return newStr
 }
}

Examples used:

let encoded = "this is so &quot;good&quot;"
let decoded = encoded.htmlDecoded() // "this is so "good""

OR

let encoded = "this is so &quot;good&quot;".htmlDecoded() // "this is so "good""
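One caveat with replacing entities one after another: a Swift `Dictionary` has no defined iteration order, so `&amp;` may be replaced before the others, and an input such as `&amp;lt;` would then be decoded twice and end up as `<` instead of `&lt;`. A minimal sketch of one way around that, assuming Swift 3 or later and a hypothetical function name `decodeBasicEntities`, is to use an ordered list and handle `&amp;` last:

```swift
import Foundation

// Ordered on purpose: "&amp;" must be replaced last, otherwise
// "&amp;lt;" would first become "&lt;" and then "<".
let orderedEntities: [(name: String, value: String)] = [
    ("&quot;", "\""),
    ("&apos;", "'"),
    ("&lt;", "<"),
    ("&gt;", ">"),
    ("&amp;", "&"),
]

/// Hypothetical helper: decodes only the five XML predefined entities.
func decodeBasicEntities(_ text: String) -> String {
    var result = text
    for entity in orderedEntities {
        result = result.replacingOccurrences(of: entity.name, with: entity.value)
    }
    return result
}

let quoted = decodeBasicEntities("this is so &quot;good&quot;")
let escapedTwice = decodeBasicEntities("&amp;lt;")  // stays "&lt;"
```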
answered Oct 27, 2015 at 16:50

1 Comment

I don't quite like this but I did not find anything better yet so this is an updated version of Michael Waterfall solution for Swift 2.0 gist.github.com/jrmgx/3f9f1d330b295cf6b1c6
4

Elegant Swift 4 Solution

If you want a string,

myString = String(htmlString: encodedString)

add this extension to your project:

extension String {
 init(htmlString: String) {
 self.init()
 guard let encodedData = htmlString.data(using: .utf8) else {
 self = htmlString
 return
 }
 let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
 .documentType: NSAttributedString.DocumentType.html,
 .characterEncoding: String.Encoding.utf8.rawValue
 ]
 do {
 let attributedString = try NSAttributedString(data: encodedData,
 options: attributedOptions,
 documentAttributes: nil)
 self = attributedString.string
 } catch {
 print("Error: \(error.localizedDescription)")
 self = htmlString
 }
 }
}

If you want an NSAttributedString with bold, italic, links, etc.,

textField.attributedText = try? NSAttributedString(htmlString: encodedString)

add this extension to your project:

extension NSAttributedString {
 convenience init(htmlString html: String) throws {
 try self.init(data: Data(html.utf8), options: [
 .documentType: NSAttributedString.DocumentType.html,
 .characterEncoding: String.Encoding.utf8.rawValue
 ], documentAttributes: nil)
 }
}
answered Apr 30, 2018 at 10:21

3

Swift 5.1 Version

import UIKit
extension String {
 init(htmlEncodedString: String) {
 self.init()
 guard let encodedData = htmlEncodedString.data(using: .utf8) else {
 self = htmlEncodedString
 return
 }
 let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
 .documentType: NSAttributedString.DocumentType.html,
 .characterEncoding: String.Encoding.utf8.rawValue
 ]
 do {
 let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
 self = attributedString.string
 } 
 catch {
 print("Error: \(error)")
 self = htmlEncodedString
 }
 }
}

Also, if you want to extract the date, images, metadata, title and description, you can use my pod named Readability kit.

answered Dec 16, 2019 at 4:50

2 Comments

What is it that wouldn't make it work in some prior versions, Swift 5.0, Swift 4.1, Swift 4.0, etc.?
I found an error when decode string using collectionViews
2

Swift 4

I really like the solution using documentAttributes. However, it may be too slow for parsing files and/or for use in table view cells. I can't believe that Apple does not provide a decent solution for this.

As a workaround, I found this String Extension on GitHub which works perfectly and is fast for decoding.

So for situations in which the given answer is too slow, see the solution suggested at this link: https://gist.github.com/mwaterfall/25b4a6a06dc3309d9555

Note: it does not parse HTML tags.
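Whichever decoder is used, table view cells tend to ask for the same strings repeatedly, so remembering earlier results avoids decoding them again. A rough sketch of that idea, assuming a hypothetical `DecodedStringCache` class; the decoder is passed in as a closure, and the stand-in used here only handles `&amp;`:

```swift
import Foundation

/// Hypothetical cache for decoded strings. Not thread-safe: it is meant to
/// be used from a single thread (for example, the main thread).
final class DecodedStringCache {
    private var storage: [String: String] = [:]
    private let decode: (String) -> String
    /// Counts how often the decoder actually ran (for illustration only).
    private(set) var decodeCallCount = 0

    init(decode: @escaping (String) -> String) {
        self.decode = decode
    }

    func decodedString(for raw: String) -> String {
        if let cached = storage[raw] {
            return cached
        }
        decodeCallCount += 1
        let value = decode(raw)
        storage[raw] = value
        return value
    }
}

// Stand-in decoder; a real one would be one of the extensions in this thread.
let cache = DecodedStringCache { $0.replacingOccurrences(of: "&amp;", with: "&") }
let first = cache.decodedString(for: "Tom &amp; Jerry")
let second = cache.decodedString(for: "Tom &amp; Jerry")  // served from the cache
```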

answered Dec 10, 2018 at 16:56

2

Computed var version of @yishus' answer

public extension String {
 /// Decodes string with HTML encoding.
 var htmlDecoded: String {
 guard let encodedData = self.data(using: .utf8) else { return self }
 let attributedOptions: [String : Any] = [
 NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
 NSCharacterEncodingDocumentAttribute: String.Encoding.utf8.rawValue]
 do {
 let attributedString = try NSAttributedString(data: encodedData,
 options: attributedOptions,
 documentAttributes: nil)
 return attributedString.string
 } catch {
 print("Error: \(error)")
 return self
 }
 }
}
answered Feb 23, 2017 at 8:09

2

Have a look at HTMLString - a library written in Swift that allows your program to add and remove HTML entities in strings.

For completeness, I copied the main features from the site:

  • Adds entities for ASCII and UTF-8/UTF-16 encodings
  • Removes more than 2100 named entities (like &amp;)
  • Supports removing decimal and hexadecimal entities
  • Designed to support Swift Extended Grapheme Clusters (→ 100% emoji-proof)
  • Fully unit tested
  • Fast
  • Documented
  • Compatible with Objective-C
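To make "adding entities for ASCII" concrete, here is a small sketch of the general technique. It is not HTMLString's API or implementation; the function name `asciiEscaped` is hypothetical. Markup-significant characters get named or numeric entities, and every non-ASCII scalar becomes a decimal numeric reference:

```swift
/// Hypothetical helper: escapes `text` so that the result is pure ASCII.
func asciiEscaped(_ text: String) -> String {
    var output = ""
    for scalar in text.unicodeScalars {
        switch scalar {
        case "&": output += "&amp;"
        case "<": output += "&lt;"
        case ">": output += "&gt;"
        case "\"": output += "&quot;"
        case "'": output += "&#39;"
        default:
            if scalar.isASCII {
                output.unicodeScalars.append(scalar)
            } else {
                // Decimal numeric character reference, e.g. U+20AC -> &#8364;
                output += "&#\(scalar.value);"
            }
        }
    }
    return output
}

let escaped = asciiEscaped("12 € <b>")
```

Working on Unicode scalars rather than UTF-16 code units means characters outside the Basic Multilingual Plane (emoji, for example) are emitted as a single reference instead of a surrogate pair.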
answered Mar 22, 2018 at 10:21

1 Comment

Also very interesting, thanks! Should be way more up
1

Swift 4

func decodeHTML(string: String) -> String? {
 var decodedString: String?
 if let encodedData = string.data(using: .utf8) {
 let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
 .documentType: NSAttributedString.DocumentType.html,
 .characterEncoding: String.Encoding.utf8.rawValue
 ]
 do {
 decodedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil).string
 } catch {
 print("\(error.localizedDescription)")
 }
 }
 return decodedString
}
answered Aug 8, 2018 at 14:10

1 Comment

An explanation would be in order. For example, how is it different from previous Swift 4 answers?
1

Swift 4.1 +

extension String {
 var htmlDecoded: String {
 let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
 NSAttributedString.DocumentReadingOptionKey.documentType : NSAttributedString.DocumentType.html,
 NSAttributedString.DocumentReadingOptionKey.characterEncoding : String.Encoding.utf8.rawValue
 ]
 let decoded = try? NSAttributedString(data: Data(utf8), options: attributedOptions
 , documentAttributes: nil).string
 return decoded ?? self
 }
}
answered Oct 29, 2018 at 8:52

1 Comment

An explanation would be in order. For example, how is it different from previous answers? What Swift 4.1 features are used? Does it only work in Swift 4.1 and not in previous versions? Or would it work prior to Swift 4.1, say in Swift 4.0?
1

Swift 4

extension String {
 var replacingHTMLEntities: String? {
 do {
 return try NSAttributedString(data: Data(utf8), options: [
 .documentType: NSAttributedString.DocumentType.html,
 .characterEncoding: String.Encoding.utf8.rawValue
 ], documentAttributes: nil).string
 } catch {
 return nil
 }
 }
}

Simple Usage

let clean = "Weeknd &#8216;King Of The Fall&#8217;".replacingHTMLEntities ?? "default value"
answered Nov 4, 2017 at 16:02

2 Comments

I can already hear people complaining about my force unwrapped optional. If you are researching HTML string encoding and you do not know how to deal with Swift optionals, you're too far ahead of yourself.
yup, there it was (edited Nov 1 at 22:37 and made the "Simple Usage" much harder to comprehend)
1

Updated answer working on Swift 3

extension String {
 init?(htmlEncodedString: String) {
 let encodedData = htmlEncodedString.data(using: String.Encoding.utf8)!
 let attributedOptions = [ NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType]
 guard let attributedString = try? NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil) else {
 return nil
 }
 self.init(attributedString.string)
 }
}
answered Feb 10, 2017 at 11:15

0

Objective-C

+(NSString *) decodeHTMLEnocdedString:(NSString *)htmlEncodedString {
 if (!htmlEncodedString) {
 return nil;
 }
 NSData *data = [htmlEncodedString dataUsingEncoding:NSUTF8StringEncoding];
 NSDictionary *attributes = @{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
 NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)};
 NSAttributedString *attributedString = [[NSAttributedString alloc] initWithData:data options:attributes documentAttributes:nil error:nil];
 return [attributedString string];
}
answered Jan 8, 2019 at 10:12

0

Swift 3.0 version with actual font size conversion

Normally, if you directly convert HTML content to an attributed string, the font size is increased. You can try to convert an HTML string to an attributed string and back again to see the difference.

Instead, here is the actual size conversion that makes sure the font size does not change, by applying the 0.75 ratio on all fonts:

extension String {
 func htmlAttributedString() -> NSAttributedString? {
 guard let data = self.data(using: String.Encoding.utf16, allowLossyConversion: false) else { return nil }
 guard let attriStr = try? NSMutableAttributedString(
 data: data,
 options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType],
 documentAttributes: nil) else { return nil }
 attriStr.beginEditing()
 attriStr.enumerateAttribute(NSFontAttributeName, in: NSMakeRange(0, attriStr.length), options: .init(rawValue: 0)) {
 (value, range, stop) in
 if let font = value as? UIFont {
 let resizedFont = font.withSize(font.pointSize * 0.75)
 attriStr.addAttribute(NSFontAttributeName,
 value: resizedFont,
 range: range)
 }
 }
 attriStr.endEditing()
 return attriStr
 }
}
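The 0.75 factor equals the CSS ratio between points and pixels (one inch is 96 px and 72 pt). That this ratio is the reason for the enlarged fonts is an assumption here, not something the answer states. As plain arithmetic, with a hypothetical helper `correctedPointSize`:

```swift
/// CSS defines 1 in = 96 px = 72 pt, so 1 px corresponds to 0.75 pt.
let pointsPerPixel = 72.0 / 96.0

/// Hypothetical helper: scales a font size reported by the HTML importer
/// by the same ratio the extension above applies.
func correctedPointSize(_ importedSize: Double) -> Double {
    return importedSize * pointsPerPixel
}

let corrected = correctedPointSize(16)  // 12.0
```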
answered Jul 15, 2017 at 2:44

0

Swift 4

extension String {
 mutating func toHtmlEncodedString() {
 guard let encodedData = self.data(using: .utf8) else {
 return
 }
 let attributedOptions: [NSAttributedString.DocumentReadingOptionKey : Any] = [
 NSAttributedString.DocumentReadingOptionKey(rawValue: NSAttributedString.DocumentAttributeKey.documentType.rawValue): NSAttributedString.DocumentType.html,
 NSAttributedString.DocumentReadingOptionKey(rawValue: NSAttributedString.DocumentAttributeKey.characterEncoding.rawValue): String.Encoding.utf8.rawValue
 ]
 do {
 let attributedString = try NSAttributedString(data: encodedData, options: attributedOptions, documentAttributes: nil)
 self = attributedString.string
 }
 catch {
 print("Error: \(error)")
 }
 }
}
answered Nov 5, 2017 at 8:32

2 Comments

Please, the rawValue syntax NSAttributedString.DocumentReadingOptionKey(rawValue: NSAttributedString.DocumentAttributeKey.documentType.rawValue) and NSAttributedString.DocumentReadingOptionKey(rawValue: NSAttributedString.DocumentAttributeKey.characterEncoding.rawValue) is horrible. Replace it with .documentType and .characterEncoding
The performance of this solution is horrible. It may be okay for isolated cases, but it is not advisable for parsing files.
-1

Use:

// dataRes is the NSData value you received
var resString = NSString(data: dataRes, encoding: NSUTF8StringEncoding)
answered Mar 3, 2016 at 6:40

1 Comment

An explanation would be in order (by editing your answer, not here in comments).
