//===- FunctionInfo.h -------------------------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_DEBUGINFO_GSYM_FUNCTIONINFO_H
#define LLVM_DEBUGINFO_GSYM_FUNCTIONINFO_H

/// Function information in GSYM files encodes information for one contiguous
/// address range. If a function has discontiguous address ranges, they will
/// need to be encoded using multiple FunctionInfo objects.
///
/// The function information gets the function start address as an argument
/// to the FunctionInfo::decode(...) function. This information is calculated
/// from the GSYM header and an address offset from the GSYM address offsets
/// table. The encoded FunctionInfo information must be aligned to a 4-byte
/// boundary.
///
/// The encoded data for a FunctionInfo starts with fixed data that all
/// function info objects have:
///
/// ENCODING  NAME        DESCRIPTION
/// ========= =========== ====================================================
/// uint32_t  Size        The size in bytes of this function.
/// uint32_t  Name        The string table offset of the function name.
///
/// The optional data in a FunctionInfo object follows this fixed information
/// and consists of a stream of tuples, each made up of:
///
/// ENCODING  NAME        DESCRIPTION
/// ========= =========== ====================================================
/// uint32_t  InfoType    An "InfoType" enumeration that describes the type
///                       of optional data that is encoded.
/// uint32_t  InfoLength  The size in bytes of the encoded data that
///                       immediately follows this length if this value is
///                       greater than zero.
/// uint8_t[] InfoData    Encoded bytes that represent the data for the
///                       "InfoType". These bytes are only present if
///                       "InfoLength" is greater than zero.
///
/// The "InfoType" is an enumeration:
///
///     LineTableInfo = 1u,
///     MergedFunctionsInfo = 3u,
///
/// This stream of tuples is terminated by an "InfoType" whose value is
/// InfoType::EndOfList and a zero for "InfoLength". This signifies the end of
/// the optional information list. This format allows us to add new optional
/// information data to a FunctionInfo object over time and allows older
/// clients to still parse the format and skip over any data that they don't
/// understand or want to parse.
///
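/// So the function information encoding essentially looks like:
///
/// struct {
///   uint32_t Size;
///   uint32_t Name;
///   struct {
///     uint32_t InfoType;
///     uint32_t InfoLength;
///     uint8_t InfoData[InfoLength];
///   }[N];
/// }
///
/// Where "N" is the number of tuples.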
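
// Illustration only, not part of the LLVM API: a minimal sketch of walking the
// tuple stream described above and skipping payloads by "InfoLength", which is
// how an older client can step over types it does not understand. The names
// readU32LE, TupleHeader, and walkTuples are hypothetical. The sketch assumes
// little-endian data and takes InfoType::EndOfList to be 0; a real reader
// would use the byte order recorded for the GSYM file and llvm::DataExtractor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: read a little-endian uint32_t at Offset and advance
// Offset past it. Returns std::nullopt if fewer than 4 bytes remain.
static std::optional<uint32_t> readU32LE(const std::vector<uint8_t> &Buf,
                                         size_t &Offset) {
  if (Offset > Buf.size() || Buf.size() - Offset < 4)
    return std::nullopt;
  uint32_t V = 0;
  for (int I = 0; I < 4; ++I)
    V |= static_cast<uint32_t>(Buf[Offset + I]) << (8 * I);
  Offset += 4;
  return V;
}

// Hypothetical record for one (InfoType, InfoLength) tuple header.
struct TupleHeader {
  uint32_t InfoType;
  uint32_t InfoLength;
  size_t DataOffset; // Where this tuple's InfoData starts in the buffer.
};

// Assumed value of InfoType::EndOfList for this sketch.
constexpr uint32_t EndOfList = 0;

// Walk the tuples that follow the fixed Size and Name fields. Returns every
// tuple seen before the terminator, or std::nullopt if the data is truncated.
static std::optional<std::vector<TupleHeader>>
walkTuples(const std::vector<uint8_t> &Buf) {
  size_t Offset = 0;
  // Skip the fixed fields: Size, then Name.
  if (!readU32LE(Buf, Offset) || !readU32LE(Buf, Offset))
    return std::nullopt;
  std::vector<TupleHeader> Tuples;
  while (true) {
    std::optional<uint32_t> Type = readU32LE(Buf, Offset);
    std::optional<uint32_t> Length = readU32LE(Buf, Offset);
    if (!Type || !Length)
      return std::nullopt;
    if (*Type == EndOfList)
      return Tuples;
    if (Buf.size() - Offset < *Length)
      return std::nullopt;
    Tuples.push_back({*Type, *Length, Offset});
    Offset += *Length; // Skip the payload without interpreting it.
  }
}
```

// Because every tuple carries its own length, the walker never needs to know
// what an unfamiliar InfoType contains in order to reach the next tuple.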
  /// If we encode a FunctionInfo during segmenting so we know its size, we can
  /// cache that encoding here so we don't need to re-encode it when saving the
  /// GSYM file.
  /// Query if a FunctionInfo has rich debug info.
  ///
  /// \returns A bool that indicates if this object has something other than
  /// range and name. When converting information from a symbol table and from
  /// debug info, we might end up with multiple FunctionInfo objects for the
  /// same range and we need to be able to tell which one is the better object
  /// to use.
  /// Query if a FunctionInfo object is valid.
  ///
  /// Address and size can be zero and there can be no line entries for a
  /// symbol so the only indication this entry is valid is if the name is
  /// not zero. This can happen when extracting information from symbol
  /// tables that do not encode symbol sizes. In that case only the
  /// address and name will be filled in.
  ///
  /// \returns A boolean indicating if this FunctionInfo is valid.
  /// Decode an object from a binary data stream.
  ///
  /// \param Data The binary stream to read the data from. This object must
  /// have the data for the object starting at offset zero. The data
  /// can contain more data than needed.
  ///
  /// \param BaseAddr The FunctionInfo's start address, which will be used as
  /// the base address when decoding any contained information like the line
  /// table and the inline info.
  ///
  /// \returns A FunctionInfo or an error describing the issue that was
  /// encountered during decoding.
  /// Encode this object into a FileWriter stream.
  ///
  /// \param O The binary stream to write the data to at the current file
  /// position.
  ///
  /// \param NoPadding Directly write the FunctionInfo data, without any
  /// padding. By default, FunctionInfo will be 4-byte aligned by padding with
  /// 0's at the start. This is OK since the function will return the offset of
  /// the actual data in the stream. However, when writing FunctionInfo objects
  /// as a stream, the padding will break the decoding of the data, since the
  /// offset where the FunctionInfo starts is not kept in this scenario.
  ///
  /// \returns An error object that indicates failure or the offset of the
  /// function info that was successfully written into the stream.
  LLVM_ABI llvm::Expected<uint64_t> encode(FileWriter &O,
                                           bool NoPadding = false) const;
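
// Illustration only, not the FileWriter implementation: a minimal sketch of
// the padding behavior described for NoPadding. The stream is zero-padded to a
// 4-byte boundary before the payload is written, and the returned offset
// points at the payload, not at the padding. appendAligned is a hypothetical
// helper and std::vector<uint8_t> stands in for the output stream.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append Payload to Out and return the offset at which the payload starts.
// Unless NoPadding is set, Out is first padded with zero bytes up to the next
// multiple of 4.
static uint64_t appendAligned(std::vector<uint8_t> &Out,
                              const std::vector<uint8_t> &Payload,
                              bool NoPadding = false) {
  if (!NoPadding)
    while (Out.size() % 4 != 0)
      Out.push_back(0);
  const uint64_t Offset = Out.size();
  Out.insert(Out.end(), Payload.begin(), Payload.end());
  return Offset;
}
```

// A caller that records the returned offset can always find the payload. A
// caller that concatenates payloads and decodes them back to back has no such
// record, which is the case the NoPadding flag exists for.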
  /// Encode this function info into the internal byte cache and return the size
  /// in bytes.
  ///
  /// When segmenting GSYM files we need to know how big each FunctionInfo will
  /// encode into so we can generate segments of the right size. We don't want
  /// to have to encode a FunctionInfo twice, so we can cache the encoded bytes
  /// and re-use them when calling FunctionInfo::encode(...).
  ///
  /// \returns The size in bytes of the FunctionInfo if it were to be encoded
  /// into a byte stream.
  /// Look up an address within a FunctionInfo object's data stream.
  ///
  /// Instead of decoding an entire FunctionInfo object when doing lookups,
  /// we can decode only the information we need from the FunctionInfo's data
  /// for the specific address. The lookup result information is returned as
  /// a LookupResult.
  ///
  /// \param Data The binary stream to read the data from. This object must
  /// have the data for the object starting at offset zero. The data
  /// can contain more data than needed.
  ///
  /// \param GR The GSYM reader that contains the string and file table that
  /// will be used to fill in information in the returned result.
  ///
  /// \param FuncAddr The function start address decoded from the GsymReader.
  ///
  /// \param Addr The address to look up.
  ///
  /// \param MergedFuncsData A pointer to an optional DataExtractor that, if
  /// non-null, will be set to the raw data of the MergedFunctionInfo, if
  /// present.
  ///
  /// \returns A LookupResult or an error describing the issue that was
  /// encountered during decoding. An error should only be returned if the
  /// address is not contained in the FunctionInfo or if the data is corrupted.
  LLVM_ABI static llvm::Expected<LookupResult>
  lookup(DataExtractor &Data, const GsymReader &GR, uint64_t FuncAddr,
         uint64_t Addr,
         std::optional<DataExtractor> *MergedFuncsData = nullptr);
inline bool operator==(const FunctionInfo &LHS, const FunctionInfo &RHS) {
  return LHS.Range == RHS.Range && LHS.Name == RHS.Name &&
         LHS.OptLineTable == RHS.OptLineTable && LHS.Inline == RHS.Inline;
}
/// This sorting will order things consistently by address range first, and
/// then by increasing levels of debug info like inline information and line
/// tables. We might end up with a FunctionInfo from debug info that will have
/// the same range as one from the symbol table, but we want to quickly be
/// able to sort and use the best version when creating the final GSYM file.
/// This function compares the inline information because we have seen cases
/// where LTO can generate a wide array of differing inline information,
/// mostly due to messing up the address ranges for inlined functions, so the
/// inline information with the most entries will appear last. If the inline
/// information matches, either by both function infos not having any or both
/// being exactly the same, we will then compare line tables. Comparing line
/// tables allows the entry with the most line entries to appear last. This
/// ensures we are able to save the FunctionInfo with the most debug info into
/// the GSYM file.
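inline bool operator<(const FunctionInfo &LHS, const FunctionInfo &RHS) {
  // Sort by address range first, then by inline info, then by line table.
  return std::tie(LHS.Range, LHS.Inline, LHS.OptLineTable) <
         std::tie(RHS.Range, RHS.Inline, RHS.OptLineTable);
}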
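
// Illustration only: a minimal sketch of how a std::tie ordering like the one
// above lets a caller sort entries and keep the last one of each equal-range
// run as the one with the most debug info. Info, lessInfo, and keepRichest are
// hypothetical stand-ins. Entry counts wrapped in std::optional take the place
// of the real InlineInfo and LineTable comparisons, on the assumption that a
// missing value sorts before any present value, as it does for std::optional.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for FunctionInfo.
struct Info {
  std::pair<uint64_t, uint64_t> Range; // [start, end)
  std::optional<unsigned> Inline;      // Number of inline entries, if any.
  std::optional<unsigned> Lines;       // Number of line entries, if any.
};

// Same shape as the comparison above: range, then inline info, then lines.
static bool lessInfo(const Info &L, const Info &R) {
  return std::tie(L.Range, L.Inline, L.Lines) <
         std::tie(R.Range, R.Inline, R.Lines);
}

// Sort, then keep only the last entry for each distinct range.
static std::vector<Info> keepRichest(std::vector<Info> Infos) {
  std::sort(Infos.begin(), Infos.end(), lessInfo);
  std::vector<Info> Result;
  for (const Info &I : Infos) {
    if (!Result.empty() && Result.back().Range == I.Range)
      Result.back() = I; // A later entry in sort order replaces an earlier one.
    else
      Result.push_back(I);
  }
  return Result;
}
```

// With this ordering a symbol-table entry that has only a range sorts before a
// debug-info entry for the same range, so the debug-info entry is the one that
// survives.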

#endif // LLVM_DEBUGINFO_GSYM_FUNCTIONINFO_H