|
1 | 1 | ## json
|
2 | 2 |
|
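3 | | -A language as capable as Python can of course handle JSON. The two main functions are `json.dumps()` and `json.loads()`: the first encodes Python data such as a dict into JSON, and the second decodes JSON back into Python data. |
| 3 | +A language as capable as Python can of course handle JSON. The two main functions are `json.dumps()` and `json.loads()`: the first encodes Python data such as a dict into a JSON string, and the second decodes a JSON string back into Python data. |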
4 | 4 |
|
5 | 5 | > ujson is a faster alternative, and simplejson offers broader compatibility
|
6 | 6 |
|
7 | 7 | There are four main functions:
|
8 | 8 |
|
9 | 9 | ```
|
10 | | -dump(obj, fp, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, encoding='utf-8', default=None, sort_keys=False, **kw) # convert json to a string and store it in a file |
11 | | -dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, encoding='utf-8', default=None, sort_keys=False, **kw) # convert json to a string |
12 | | -load(fp, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw) # read a string from a file and convert it to json |
13 | | -loads(s, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw) # convert a string to json |
| 10 | +# Serialize a Python object to a JSON string and write it to a file |
| 11 | +dump(obj, fp, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, encoding='utf-8', default=None, sort_keys=False, **kw) |
| 12 | +# Serialize a Python object to a JSON string |
| 13 | +dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, encoding='utf-8', default=None, sort_keys=False, **kw) |
| 14 | +# Read a JSON string from a file and convert it to a Python object |
| 15 | +load(fp, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw) |
| 16 | +# Convert a JSON string to a Python object |
| 17 | +loads(s, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw) |
14 | 18 | ```
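
As a rough illustration of how the four functions pair up, here is a minimal round-trip sketch. It is written for Python 3, whereas the signatures above (with their `encoding` parameter) come from Python 2. The sample data and the use of `io.StringIO` in place of a real file are assumptions made for the example.

```python
import io
import json

data = {"name": "ACME", "shares": 100, "price": 542.23}

# dumps: Python object -> JSON string
text = json.dumps(data)

# loads: JSON string -> Python object
restored = json.loads(text)

# dump / load do the same work through a file-like object
buffer = io.StringIO()
json.dump(data, buffer)
buffer.seek(0)  # rewind so load() reads from the start
restored_from_file = json.load(buffer)

print(restored == data, restored_from_file == data)
```
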
|
15 | 19 |
|
16 | 20 | ```python
|
@@ -160,5 +164,122 @@ json and dict differ in two other ways
|
160 | 164 |
|
161 | 165 | June 21, 2018
|
162 | 166 |
|
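163 | | -- `` applies a simple compression to the formatted object by removing spaces |
164 | | -- `json.dumps(obj, separators=(',',':'), ensure_ascii=False)` outputs Chinese text as readable UTF-8 characters instead of Unicode escapes of the form `\uXXXX` |
| 167 | +- `json.dumps(obj, indent=4)` outputs a formatted string with line breaks and indentation. |
| 168 | +- `json.dumps(obj, separators=(',',':'))` compacts the output string by removing spaces, since the default is `(', ', ': ')`. |
| 169 | +- `json.dumps(obj, ensure_ascii=False)` outputs Chinese text as readable UTF-8 characters instead of Unicode escapes of the form `\uXXXX`. |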
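
The three `json.dumps` options above can be compared side by side. The following is a small sketch for Python 3; the sample object is made up for illustration.

```python
import json

obj = {"name": "中文", "items": [1, 2]}

# indent=4: multi-line output with four-space indentation
pretty = json.dumps(obj, indent=4)

# separators=(",", ":"): no spaces after commas or colons
compact = json.dumps(obj, separators=(",", ":"))

# ensure_ascii=False: non-ASCII characters are written as-is
readable = json.dumps(obj, ensure_ascii=False)

print(pretty)
print(compact)
print(readable)
```

The options are independent, so they can be combined in a single call when both compact and readable output are wanted.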
| 170 | + |
| 171 | +September 9, 2020 |
| 172 | + |
| 173 | +A normal JSON string such as `{"price": 542.23, "name": "ACME", "shares": 100, "others": ["first thing", "second thing", "third thing"]}` parses without trouble. If a key or value in the JSON object contains a control character, however, parsing raises a `ValueError` with the message `Invalid control character`. |
| 174 | + |
| 175 | +**What is a control character?** |
| 176 | +In the ASCII table, the first 32 characters (codes 0 to 31) and the last one (code 127) are control characters, including `\t`, `\n`, and `\r`. |
| 177 | + |
| 178 | +[ASCII table](http://c.biancheng.net/c/ascii/) |
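
To make the definition concrete, here is a short sketch for Python 3. The helper `is_ascii_control` is a hypothetical name introduced for this example. The second half checks which of these characters the default (strict) decoder rejects inside a string: Python's `json` documentation defines control characters for this purpose as codes 0 to 31, so code 127 is not rejected.

```python
import json

def is_ascii_control(ch):
    """Hypothetical helper: True for ASCII codes 0-31 and 127 (DEL)."""
    code = ord(ch)
    return code < 32 or code == 127

print([is_ascii_control(c) for c in "\t\n\r a\x7f"])

# The strict decoder rejects codes 0-31 inside a JSON string...
try:
    json.loads('"a\tb"')
    tab_rejected = False
except ValueError:
    tab_rejected = True

# ...but accepts DEL (code 127) inside a JSON string.
del_value = json.loads('"a\x7fb"')

print(tab_rejected, repr(del_value))
```
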
| 179 | + |
| 180 | +**What to do when a control character appears?** |
| 181 | + |
| 182 | +Take a JSON string like this one: `'{"price": 542.23, "name": "ACME", "sh\rares": 100, "others": ["first thing", "second\t thing", "third\n thing"]}'` |
| 183 | + |
| 184 | +Don't panic. Pass `strict=False` when parsing and the string will load. |
| 185 | + |
| 186 | +``` |
| 187 | +In [28]: s = '{"price": 542.23, "name": "ACME", "shares": 100, "others": ["first thing", "second thing", "third thing"]}' |
| 188 | + |
| 189 | +In [29]: json.loads(s) |
| 190 | +Out[29]: |
| 191 | +{u'name': u'ACME', |
| 192 | + u'others': [u'first thing', u'second thing', u'third thing'], |
| 193 | + u'price': 542.23, |
| 194 | + u'shares': 100} |
| 195 | + |
| 196 | +In [30]: s = '{"price": 542.23, "name": "ACME", "sh\rares": 100, "others": ["first thing", "second\t thing", "third\n thing"]}' |
| 197 | + |
| 198 | +In [31]: json.loads(s) |
| 199 | +--------------------------------------------------------------------------- |
| 200 | +ValueError Traceback (most recent call last) |
| 201 | +<ipython-input-31-48280973ea66> in <module>() |
| 202 | +----> 1 json.loads(s) |
| 203 | + |
| 204 | +/Users/bytedance/miniconda/envs/byted/lib/python2.7/json/__init__.pyc in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw) |
| 205 | + 337 parse_int is None and parse_float is None and |
| 206 | + 338 parse_constant is None and object_pairs_hook is None and not kw): |
| 207 | +--> 339 return _default_decoder.decode(s) |
| 208 | + 340 if cls is None: |
| 209 | + 341 cls = JSONDecoder |
| 210 | + |
| 211 | +/Users/bytedance/miniconda/envs/byted/lib/python2.7/json/decoder.pyc in decode(self, s, _w) |
| 212 | + 362 |
| 213 | + 363 """ |
| 214 | +--> 364 obj, end = self.raw_decode(s, idx=_w(s, 0).end()) |
| 215 | + 365 end = _w(s, end).end() |
| 216 | + 366 if end != len(s): |
| 217 | + |
| 218 | +/Users/bytedance/miniconda/envs/byted/lib/python2.7/json/decoder.pyc in raw_decode(self, s, idx) |
| 219 | + 378 """ |
| 220 | + 379 try: |
| 221 | +--> 380 obj, end = self.scan_once(s, idx) |
| 222 | + 381 except StopIteration: |
| 223 | + 382 raise ValueError("No JSON object could be decoded") |
| 224 | + |
| 225 | +ValueError: Invalid control character at: line 1 column 38 (char 37) |
| 226 | + |
| 227 | +In [32]: json.loads(s, strict=False) |
| 228 | +Out[32]: |
| 229 | +{u'name': u'ACME', |
| 230 | + u'others': [u'first thing', u'second\t thing', u'third\n thing'], |
| 231 | + u'price': 542.23, |
| 232 | + u'sh\rares': 100} |
| 233 | +``` |
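
The error in the session above arises because Python itself interprets `\r`, `\t`, and `\n` in an ordinary string literal, so raw control characters reach the decoder. As a complement to `strict=False`, the sketch below (Python 3, with made-up sample data) shows that the default strict mode parses the text once the backslash sequences reach the decoder intact, and that `json.dumps` always emits the escaped form.

```python
import json

# Raw string: the decoder sees a backslash followed by "t" or "n",
# which are valid JSON escapes, so strict mode accepts them.
escaped = r'{"others": ["second\t thing", "third\n thing"]}'
parsed = json.loads(escaped)

# dumps escapes control characters, so its output round-trips in strict mode.
encoded = json.dumps({"sh\rares": 100})
round_trip = json.loads(encoded)

print(parsed["others"])
print(encoded)
```
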
| 234 | + |
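| 235 | +**Two more points to note** |
| 236 | +1. Control characters that fall outside JSON string values parse normally. Ordinary newlines between two keys are fine, for example in this string: `'\n{"price": 542.23,\n "name": "ACME", \t"shares": 100, "others": ["first thing", "second thing",\n "third thing"]}'` |
| 237 | +2. A line break typed directly into the text, rather than written as `\n`, is the same newline character. What matters is that no string element inside the JSON may contain a raw newline. |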
| 238 | + |
| 239 | +``` |
| 240 | +In [34]: s = '\n{"price": 542.23,\n "name": "ACME", \t"shares": 100, "others": ["first thing", "second thing",\n "third thing"]}' |
| 241 | + |
| 242 | +In [35]: json.loads(s) |
| 243 | +Out[35]: |
| 244 | +{u'name': u'ACME', |
| 245 | + u'others': [u'first thing', u'second thing', u'third thing'], |
| 246 | + u'price': 542.23, |
| 247 | + u'shares': 100} |
| 248 | + |
| 249 | +In [37]: s= """{"price": 542.23, "name": "ACME", "shares": 100, "others": ["first thing", "second |
| 250 | + ...: thing", "third thing"]}""" |
| 251 | + |
| 252 | +In [38]: s |
| 253 | +Out[38]: '{"price": 542.23, "name": "ACME", "shares": 100, "others": ["first thing", "second \nthing", "third thing"]}' |
| 254 | + |
| 255 | +In [39]: json.loads(s) |
| 256 | +--------------------------------------------------------------------------- |
| 257 | +ValueError Traceback (most recent call last) |
| 258 | +<ipython-input-39-48280973ea66> in <module>() |
| 259 | +----> 1 json.loads(s) |
| 260 | + |
| 261 | +/Users/bytedance/miniconda/envs/byted/lib/python2.7/json/__init__.pyc in loads(s, encoding, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw) |
| 262 | + 337 parse_int is None and parse_float is None and |
| 263 | + 338 parse_constant is None and object_pairs_hook is None and not kw): |
| 264 | +--> 339 return _default_decoder.decode(s) |
| 265 | + 340 if cls is None: |
| 266 | + 341 cls = JSONDecoder |
| 267 | + |
| 268 | +/Users/bytedance/miniconda/envs/byted/lib/python2.7/json/decoder.pyc in decode(self, s, _w) |
| 269 | + 362 |
| 270 | + 363 """ |
| 271 | +--> 364 obj, end = self.raw_decode(s, idx=_w(s, 0).end()) |
| 272 | + 365 end = _w(s, end).end() |
| 273 | + 366 if end != len(s): |
| 274 | + |
| 275 | +/Users/bytedance/miniconda/envs/byted/lib/python2.7/json/decoder.pyc in raw_decode(self, s, idx) |
| 276 | + 378 """ |
| 277 | + 379 try: |
| 278 | +--> 380 obj, end = self.scan_once(s, idx) |
| 279 | + 381 except StopIteration: |
| 280 | + 382 raise ValueError("No JSON object could be decoded") |
| 281 | + |
| 282 | +ValueError: Invalid control character at: line 1 column 84 (char 83) |
| 283 | + |
| 284 | +``` |
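
One way to combine the two observations is to try strict parsing first and retry with `strict=False` only when it fails. The sketch below targets Python 3, and `loads_lenient` is a hypothetical helper name, not part of the `json` module. Input that is invalid for any other reason still raises on the second attempt.

```python
import json

def loads_lenient(text):
    """Hypothetical helper: parse strictly, then retry with strict=False."""
    try:
        return json.loads(text)
    except ValueError:
        # In Python 3, json.JSONDecodeError is a subclass of ValueError.
        return json.loads(text, strict=False)

# Newlines between tokens are plain whitespace and parse in strict mode.
between_tokens = '\n{"name": "ACME",\n "shares": 100}'

# A raw newline inside a string value only parses with strict=False.
inside_string = '{"others": ["second \nthing"]}'

print(loads_lenient(between_tokens))
print(loads_lenient(inside_string))
```
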
| 285 | + |