MailMimeParser 1.1

SubjectConsumer extends GenericConsumer

Extends GenericConsumer to remove its sub consumers.

Prior to this, subject headers were parsed using GenericConsumer. As a result, text within parentheses in a subject was treated as a comment and left out of the value returned by getHeaderValue, mime-encoded parts within quotes were left undecoded, and a backslash denoted an escaped character.

From testing in Thunderbird and Outlook web mail, it seems that quoting has no effect (e.g. a quoted "mime-literal" encoded part still comes out decoded), and parts in parentheses (comments) are displayed normally.

Tags
author

Zaahid Bateson

Table of Contents

$consumerService  : ConsumerService
$partFactory  : HeaderPartFactory
__construct()  : mixed
Initializes the instance.
__invoke()  : array<string|int, HeaderPart>
Invokes parsing of a header's value into header parts.
getInstance()  : mixed
Returns the singleton instance for the class.
advanceToNextToken()  : mixed
Determines if the iterator should be advanced to the next token after reading tokens or finding a start token.
filterIgnoredSpaces()  : array<string|int, HeaderPart>
Filters out ignorable spaces between parts in the passed array.
getAllConsumers()  : array<string|int, AbstractConsumer>
Returns this consumer and all unique sub consumers.
getAllTokenSeparators()  : array<string|int, string>
Returns a list of regular expression markers for this consumer and all sub-consumers by calling 'getTokenSeparators'.
getConsumerTokenParts()  : array<string|int, HeaderPart>|array<string|int, mixed>
Iterates through this consumer's sub-consumers checking if the current token triggers a sub-consumer's start token and passes control onto that sub-consumer's parseTokensIntoParts. If no sub-consumer is responsible for the current token, calls getPartForToken and returns it in an array.
getPartForToken()  : HeaderPart|null
Constructs and returns a \ZBateson\MailMimeParser\Header\Part\HeaderPart for the passed string token. If the token should be ignored, the function must return null.
getSubConsumers()  : array<string|int, AbstractConsumer>
Returns an empty array
getTokenParts()  : array<string|int, HeaderPart>|array<string|int, mixed>
Returns an array of \ZBateson\MailMimeParser\Header\Part\HeaderPart for the current token on the iterator.
getTokenSeparators()  : array<string|int, string>
Returns an array of regular expression separators specific to this consumer. The returned patterns are used to split the header value into tokens for the consumer to parse into parts.
getTokenSplitPattern()  : string
Overridden to not split out a backslash character and the character following it, which AbstractConsumer treats as a special case.
isEndToken()  : bool
Returns true if the passed string token marks the end marker for the current consumer.
isStartToken()  : bool
Returns true if the passed string token marks the beginning marker for the current consumer.
parseTokensIntoParts()  : array<string|int, HeaderPart>
Iterates over the passed token Iterator and returns an array of parsed \ZBateson\MailMimeParser\Header\Part\HeaderPart objects.
processParts()  : array<string|int, HeaderPart>
Performs any final processing on the array of parsed parts before returning it to the consumer client.
splitRawValue()  : array<string|int, mixed>
Returns an array of split tokens from the input string.
addSpaces()  : mixed
Checks if the passed space part should be added to the returned parts and adds it.
addSpaceToRetParts()  : mixed
Loops over the $parts array from the current position, checks if the space should be added, then adds it to $retParts and returns.
isSpaceToken()  : bool
Returns true if the passed HeaderPart is a Token instance and a space.
parseRawValue()  : array<string|int, HeaderPart>
Called by __invoke to parse the raw header value into header parts.
shouldAddSpace()  : bool
Returns true if a space should be added based on the passed last and next parts.

Properties

Methods

__invoke()

Invokes parsing of a header's value into header parts.

public __invoke(string $value) : array<string|int, HeaderPart>
Parameters
$value : string

the raw header value

Return values
array<string|int, HeaderPart>

the array of parsed parts

getInstance()

Returns the singleton instance for the class.

public static getInstance(ConsumerService $consumerService, HeaderPartFactory $partFactory) : mixed
Parameters
$consumerService : ConsumerService
$partFactory : HeaderPartFactory
Return values
mixed

advanceToNextToken()

Determines if the iterator should be advanced to the next token after reading tokens or finding a start token.

protected advanceToNextToken(Iterator $tokens, bool $isStartToken) : mixed

The default implementation will advance for a start token, but not advance on the end token of the current consumer, allowing the end token to be passed up to a higher-level consumer.

Parameters
$tokens : Iterator
$isStartToken : bool
Return values
mixed

filterIgnoredSpaces()

Filters out ignorable spaces between parts in the passed array.

protected filterIgnoredSpaces(array<string|int, HeaderPart> $parts) : array<string|int, HeaderPart>

A space is filtered out when the parts on either side of it specify that it can be ignored. filterIgnoredSpaces is called from within processParts; an implementing class that overrides processParts must call it explicitly if the filtering is still needed.

Parameters
$parts : array<string|int, HeaderPart>
Return values
array<string|int, HeaderPart>
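
The filtering rule can be sketched as follows. This is a minimal illustration, not the library's code: the part() helper, the filterSpaces() function and the array keys 'space', 'ignoreBefore' and 'ignoreAfter' are hypothetical stand-ins for HeaderPart objects and their space-handling flags. Dropping a space that has no following part is a simplifying assumption of the sketch.

```php
<?php
// Hypothetical stand-in for a HeaderPart: 'space' marks a whitespace token,
// 'ignoreBefore' / 'ignoreAfter' say whether an adjacent space may be dropped.
function part(string $value, bool $ignoreBefore = false, bool $ignoreAfter = false): array
{
    return [
        'value' => $value,
        'space' => trim($value) === '',
        'ignoreBefore' => $ignoreBefore,
        'ignoreAfter' => $ignoreAfter,
    ];
}

// Hypothetical filter: keeps a space only if a part precedes it and at least
// one of its two neighbours does not allow the space to be ignored.
function filterSpaces(array $parts): array
{
    $kept = [];
    $pendingSpace = null;
    foreach ($parts as $p) {
        if ($p['space']) {
            $pendingSpace = $p; // hold it until the next non-space part
            continue;
        }
        $last = $kept ? $kept[count($kept) - 1] : null;
        if ($pendingSpace !== null && $last !== null
            && (!$last['ignoreAfter'] || !$p['ignoreBefore'])) {
            $kept[] = $pendingSpace;
        }
        // Assumption: a space with no following part is simply dropped.
        $pendingSpace = null;
        $kept[] = $p;
    }
    return $kept;
}

$parts = [
    part(' '),
    part('Re:'),
    part(' '),
    part('hello', true, true),
    part(' '),
    part('world', true, true),
];
$result = implode('', array_column(filterSpaces($parts), 'value'));
```

Here the leading space is never added, the space after 'Re:' is kept because 'Re:' does not ignore trailing spaces, and the space between the two ignoring parts is removed.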

getAllConsumers()

Returns this consumer and all unique sub consumers.

protected getAllConsumers() : array<string|int, AbstractConsumer>

Loops into the sub-consumers (and their sub-consumers, etc...) finding all unique consumers, and returns them in an array.

Return values
array<string|int, AbstractConsumer>

getAllTokenSeparators()

Returns a list of regular expression markers for this consumer and all sub-consumers by calling 'getTokenSeparators'.

protected getAllTokenSeparators() : array<string|int, string>

Return values
array<string|int, string>

an array of regular expression markers

getConsumerTokenParts()

Iterates through this consumer's sub-consumers checking if the current token triggers a sub-consumer's start token and passes control onto that sub-consumer's parseTokensIntoParts. If no sub-consumer is responsible for the current token, calls getPartForToken and returns it in an array.

protected getConsumerTokenParts(Iterator $tokens) : array<string|int, HeaderPart>|array<string|int, mixed>
Parameters
$tokens : Iterator
Return values
array<string|int, HeaderPart>|array<string|int, mixed>

getPartForToken()

Constructs and returns a \ZBateson\MailMimeParser\Header\Part\HeaderPart for the passed string token. If the token should be ignored, the function must return null.

protected getPartForToken(string $token, bool $isLiteral) : HeaderPart|null

The default created part uses the instance's partFactory->newInstance method.

Parameters
$token : string

the token

$isLiteral : bool

set to true if the token represents a literal - e.g. an escaped token

Return values
HeaderPart|null

the constructed header part or null if the token should be ignored

getTokenParts()

Returns an array of \ZBateson\MailMimeParser\Header\Part\HeaderPart for the current token on the iterator.

protected getTokenParts(Iterator $tokens) : array<string|int, HeaderPart>|array<string|int, mixed>

Overridden from AbstractConsumer to remove special filtering for backslash escaping, which also seems not to apply to Subject headers, at least in Thunderbird's implementation.

Parameters
$tokens : Iterator
Return values
array<string|int, HeaderPart>|array<string|int, mixed>

getTokenSeparators()

Returns an array of regular expression separators specific to this consumer. The returned patterns are used to split the header value into tokens for the consumer to parse into parts.

protected abstract getTokenSeparators() : array<string|int, string>

Each array element makes part of a generated regular expression that is used in a call to preg_split(). RegEx patterns can be used, and care should be taken to escape special characters.

Return values
array<string|int, string>

the array of patterns
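
As a rough illustration of how such separators could be combined, the sketch below joins a list of fragments into one alternation and passes it to preg_split(). The buildSplitPattern() helper, the '~' delimiter and the sample separators are assumptions made for the example, not the library's actual values.

```php
<?php
// Hypothetical helper: joins separator fragments into one preg_split() pattern.
function buildSplitPattern(array $separators): string
{
    // Each element is already a regex fragment, so literal characters must
    // be escaped by whoever supplies them (for example with preg_quote()).
    return '~(' . implode('|', $separators) . ')~';
}

// Sample separators: whitespace as a regex, parentheses as escaped literals.
$separators = ['\s+', preg_quote('(', '~'), preg_quote(')', '~')];
$pattern = buildSplitPattern($separators);

$tokens = preg_split(
    $pattern,
    'Hello (world)',
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);
// $tokens holds the text between separators plus the separators themselves.
```

Without the preg_quote() calls, a bare '(' would be read as the start of a group and make the combined pattern invalid, which is why special characters need escaping.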

getTokenSplitPattern()

Overridden to not split out a backslash character and the character following it, which AbstractConsumer treats as a special case.

protected getTokenSplitPattern() : string
Return values
string

the regex pattern
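
The effect of dropping the backslash special case can be seen by comparing two patterns. Both pattern shapes below are assumptions chosen for illustration (including the sample separators); the point is only that one has an alternative matching a backslash plus the following character and the other does not.

```php
<?php
// Illustrative separator fragments only; the real list would come from
// getAllTokenSeparators().
$separators = ['\s+', '"'];
$joined = implode('|', $separators);

// Assumed shape of a pattern that splits out "\x" as its own token.
$withEscapes = '~(\\\\.|' . $joined . ')~';
// Assumed shape of the overridden pattern: no backslash alternative.
$withoutEscapes = '~(' . $joined . ')~';

$flags = PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY;
$value = 'a\\"b'; // the four characters: a, backslash, double quote, b

$escaped = preg_split($withEscapes, $value, -1, $flags);
$plain = preg_split($withoutEscapes, $value, -1, $flags);
```

With the backslash alternative, the backslash and the quote arrive together as one token; without it, the backslash stays attached to the preceding text and the quote is split out on its own.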

isEndToken()

Returns true if the passed string token marks the end marker for the current consumer.

protected abstract isEndToken(string $token) : bool
Parameters
$token : string

the current token

Return values
bool

isStartToken()

Returns true if the passed string token marks the beginning marker for the current consumer.

protected abstract isStartToken(string $token) : bool
Parameters
$token : string

the current token

Return values
bool

parseTokensIntoParts()

Iterates over the passed token Iterator and returns an array of parsed \ZBateson\MailMimeParser\Header\Part\HeaderPart objects.

protected parseTokensIntoParts(Iterator $tokens) : array<string|int, HeaderPart>

The method checks each token to see if the token matches a sub-consumer's start token, or if it matches the current consumer's end token to stop processing.

If a sub-consumer's start token is matched, the sub-consumer is invoked and its returned parts are merged to the current consumer's header parts.

After all tokens are read and an array of header parts is constructed, the array is passed to AbstractConsumer::processParts for any final processing.

Parameters
$tokens : Iterator

an iterator over a string of tokens

Return values
array<string|int, HeaderPart>

an array of parsed parts
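
The general loop described above can be sketched in miniature. This is a hypothetical illustration of the inherited mechanism, not the library's implementation: parseTokens() plays both the top-level consumer and a single sub-consumer for "(...)" comments, and parts are plain strings. SubjectConsumer itself has no sub-consumers, so for it every token would take the plain branch.

```php
<?php
// Hypothetical miniature of the parsing loop. $inComment = true means the
// function is acting as the sub-consumer whose end token is ")".
function parseTokens(Iterator $tokens, bool $inComment = false): array
{
    $parts = [];
    while ($tokens->valid()) {
        $token = $tokens->current();
        if ($inComment && $token === ')') {
            // End token: stop without advancing so the caller sees it.
            break;
        }
        if (!$inComment && $token === '(') {
            // Start token of the sub-consumer: step past it, let the
            // sub-consumer read its tokens, then merge what it returns.
            $tokens->next();
            $parts = array_merge($parts, parseTokens($tokens, true));
        } else {
            $parts[] = $token;
        }
        // Advance past the token just handled (or past the end token that
        // the sub-consumer left on the iterator).
        $tokens->next();
    }
    // Stand-in for final processing: the sub-consumer folds its tokens
    // into a single labelled part.
    return $inComment ? ['comment:' . implode('', $parts)] : $parts;
}

$tokens = new ArrayIterator(['a', '(', 'b', ' ', 'c', ')', 'd']);
$parts = parseTokens($tokens);
```

Leaving the end token on the iterator mirrors the behaviour described for advanceToNextToken, where an end token is passed up to the higher-level consumer rather than being consumed.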

processParts()

Performs any final processing on the array of parsed parts before returning it to the consumer client.

protected processParts(array<string|int, HeaderPart> $parts) : array<string|int, HeaderPart>

The default implementation simply returns the passed array after filtering out null/empty parts.

Parameters
$parts : array<string|int, HeaderPart>
Return values
array<string|int, HeaderPart>

splitRawValue()

Returns an array of split tokens from the input string.

protected splitRawValue(string $rawValue) : array<string|int, mixed>

The method calls preg_split using getTokenSplitPattern. The split array will not contain any empty parts and will contain the markers.

Parameters
$rawValue : string

the raw string

Return values
array<string|int, mixed>

the array of tokens
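
The two properties mentioned above (no empty parts, markers included) correspond to two standard preg_split() flags. The sketch below assumes those flags are how the behaviour is obtained and uses an illustrative pattern rather than the one returned by getTokenSplitPattern.

```php
<?php
$pattern = '~(\s+|=)~'; // illustrative pattern, not the library's
$raw = 'a = b';

// Default behaviour: markers are discarded and empty strings are kept.
$default = preg_split($pattern, $raw);

// PREG_SPLIT_DELIM_CAPTURE keeps the text matched by the capture group;
// PREG_SPLIT_NO_EMPTY drops the empty strings between adjacent markers.
$tokens = preg_split(
    $pattern,
    $raw,
    -1,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);
```

Concatenating the second result reproduces the raw value exactly, which is what allows later steps to treat the markers as tokens in their own right.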

addSpaces()

Checks if the passed space part should be added to the returned parts and adds it.

private addSpaces(array<string|int, HeaderPart> $parts, array<string|int, HeaderPart> &$retParts, int $curIndex[, HeaderPart &$spacePart = null ]) : mixed

Never adds a space if it would be the first part; otherwise adds it only if at least one of the adjacent parts isn't set to ignore the space.

Parameters
$parts : array<string|int, HeaderPart>
$retParts : array<string|int, HeaderPart>
$curIndex : int
$spacePart : HeaderPart = null
Return values
mixed

addSpaceToRetParts()

Loops over the $parts array from the current position, checks if the space should be added, then adds it to $retParts and returns.

private addSpaceToRetParts(array<string|int, HeaderPart> $parts, array<string|int, HeaderPart> &$retParts, int $curIndex, HeaderPart &$spacePart, HeaderPart $lastPart) : mixed
Parameters
$parts : array<string|int, HeaderPart>
$retParts : array<string|int, HeaderPart>
$curIndex : int
$spacePart : HeaderPart
$lastPart : HeaderPart
Return values
mixed

isSpaceToken()

Returns true if the passed HeaderPart is a Token instance and a space.

private isSpaceToken(HeaderPart $part) : bool
Parameters
$part : HeaderPart
Return values
bool

parseRawValue()

Called by __invoke to parse the raw header value into header parts.

private parseRawValue(string $value) : array<string|int, HeaderPart>

Calls splitRawValue to split the value into token strings, then calls parseTokensIntoParts to parse the returned array.

Parameters
$value : string
Return values
array<string|int, HeaderPart>

the array of parsed parts
