Resolving Byte Data Decoding Problems

Study notes | Author: admin | Date: 2025-05-25

Summary: When byte data cannot be decoded as UTF-8, decoding it as UTF-16 is one way around the error. A complete code example is included.

Problem description

Decoding the byte data as UTF-8 raises a UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte
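
The error is easy to reproduce with only the first four bytes of the payload. The following is a minimal sketch; the variable names `data`, `message`, and `offset` are illustrative and not part of the original note.

```python
# First four bytes of the sample payload from this note.
data = b'\xff\xff\xff\xfe'

message = None
offset = None
try:
    data.decode('utf-8')
except UnicodeDecodeError as exc:
    # exc.start is the index of the first byte the codec could not handle.
    message = str(exc)
    offset = exc.start
    print(message)
    print(f"offending byte {data[offset]:#04x} at offset {offset}")
```

Catching `UnicodeDecodeError` specifically, rather than a bare `Exception`, keeps unrelated bugs from being reported as decoding problems.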

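Cause analysis

The raw data begins with b'\xff\xff\xff\xfe'. The byte 0xff never occurs in valid UTF-8, so the payload cannot be plain UTF-8 text, and UTF-16 is a likely candidate. This is an inference rather than proof: a UTF-16 byte-order mark would be b'\xff\xfe' or b'\xfe\xff', and the sample starts with neither.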
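
One way to check whether a payload announces its own encoding is to look for a byte-order mark. The sketch below uses the BOM constants from the standard `codecs` module; the helper name `sniff_bom` is hypothetical and introduced here only for illustration.

```python
import codecs

def sniff_bom(data: bytes):
    """Return the codec name implied by a leading byte-order mark, or None."""
    # The 4-byte UTF-32 marks are checked first because BOM_UTF32_LE
    # begins with the same two bytes as BOM_UTF16_LE.
    candidates = [
        (codecs.BOM_UTF32_LE, 'utf-32'),
        (codecs.BOM_UTF32_BE, 'utf-32'),
        (codecs.BOM_UTF8, 'utf-8-sig'),
        (codecs.BOM_UTF16_LE, 'utf-16'),
        (codecs.BOM_UTF16_BE, 'utf-16'),
    ]
    for bom, name in candidates:
        if data.startswith(bom):
            return name
    return None

with_bom = sniff_bom(b'\xff\xfeh\x00i\x00')  # UTF-16 text that carries a BOM
sample = sniff_bom(b'\xff\xff\xff\xfe')      # start of the sample payload
print(with_bom, sample)
```

For the sample payload the helper finds no BOM, so the choice of codec has to come from inspecting the data or from knowledge of the system that produced it.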

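Solution

Decoding with utf-16le or utf-16 is recommended. The utf-16le codec fixes the byte order explicitly, while utf-16 relies on a BOM and otherwise falls back to the platform's native byte order: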

items = b'\xff\xff\xff\xfe\x00\x00\x00\x03\x00\x00\x00\x02\x002\x00\x00\x00\x00\x16T: 2024-12-19 08:24:50\x00\x00\x00\x02\x004\x00\x00\x00\x00\x0b1734567890\x00\x00\x00\x00\x1e\x004\x00;\x00c\x00h\x00a\x00r\x00s\x00e\x00t\x00=\x00u\x00t\x00f\x00-\x008\x00\x00\x00\x00\n1734567890'

# Decode as UTF-16 little-endian
try:
    decoded = items.decode('utf-16le')
    print("Decoded string:")
    print(decoded)
except UnicodeDecodeError as e:
    # Only decoding failures are reported here; other errors propagate.
    print(f"Decode error: {e}")
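
A decode call that raises no exception does not by itself show that the byte order was right, so it can help to decode a known slice both ways and compare. The sketch below copies the 30-byte run near the end of the sample payload; the name `field` and the ASCII check are assumptions made for illustration, and the observation applies to this sample only.

```python
# The 30-byte run near the end of the sample payload, copied from `items`.
field = b'\x004\x00;\x00c\x00h\x00a\x00r\x00s\x00e\x00t\x00=\x00u\x00t\x00f\x00-\x008'

results = {}
for codec in ('utf-16le', 'utf-16be'):
    text = field.decode(codec)
    results[codec] = text
    # Readable ASCII output is a strong hint that the byte order is correct.
    print(f"{codec}: {text!r} (ascii only: {text.isascii()})")
```

In this slice the zero byte comes before each ASCII character, which is the big-endian layout, so it is worth trying utf-16be alongside utf-16le before settling on a codec.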

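Output

Once the data is decoded correctly, key information such as the timestamp (2024-12-19 08:24:50), the ID (1734567890), and the charset (utf-8) can be extracted.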
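
The extraction step might be sketched as follows. This is not the author's method: it assumes, from the sample alone, that the timestamp and ID are stored as plain ASCII and that the charset field is stored as two-byte characters with the zero byte first. The regular expressions and the `info` dictionary are hypothetical.

```python
import re

# Sample payload copied from the solution in this note.
items = b'\xff\xff\xff\xfe\x00\x00\x00\x03\x00\x00\x00\x02\x002\x00\x00\x00\x00\x16T: 2024-12-19 08:24:50\x00\x00\x00\x02\x004\x00\x00\x00\x00\x0b1734567890\x00\x00\x00\x00\x1e\x004\x00;\x00c\x00h\x00a\x00r\x00s\x00e\x00t\x00=\x00u\x00t\x00f\x00-\x008\x00\x00\x00\x00\n1734567890'

# ASCII fields can be matched directly on the raw bytes.
timestamp = re.search(rb'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}', items)
record_id = re.search(rb'\d{10}', items)

# A run of at least four (zero byte, printable ASCII byte) pairs is treated
# as big-endian UTF-16 text and decoded on its own.
wide = re.search(rb'(?:\x00[\x20-\x7e]){4,}', items)
wide_text = wide.group().decode('utf-16be') if wide else ''

info = {
    'time': timestamp.group().decode('ascii') if timestamp else None,
    'id': record_id.group().decode('ascii') if record_id else None,
    # Text after "charset=" in the wide field; empty if the marker is absent.
    'charset': wide_text.partition('charset=')[2] or None,
}
print(info)
```

Matching on the raw bytes avoids having to pick a single codec for a payload that appears to mix single-byte and two-byte text.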

Keywords

UTF-8, UTF-16, byte decoding, decoding errors
