Modules¶
application
¶
Classes¶
Modules¶
application_service_registry
¶
Provide an application service registry.
Classes¶
ApplicationServiceRegistry
¶Define an application service registry.
Source code in src/taxpasta/infrastructure/application/application_service_registry.py
profile_reader(profiler: SupportedProfiler) -> Type[ProfileReader]
classmethod
¶Return a profile reader of the correct type.
Source code in src/taxpasta/infrastructure/application/application_service_registry.py
profile_standardisation_service(profiler: SupportedProfiler) -> Type[ProfileStandardisationService]
classmethod
¶Return a profile standardisation service of the correct type.
Source code in src/taxpasta/infrastructure/application/application_service_registry.py
standard_profile_writer(file_format: StandardProfileFileFormat) -> Type[StandardProfileWriter]
classmethod
¶Return a standard profile writer of the correct type.
Source code in src/taxpasta/infrastructure/application/application_service_registry.py
table_reader(file_format: TableReaderFileFormat) -> Type[TableReader]
classmethod
¶Return a table reader of the correct type.
Source code in src/taxpasta/infrastructure/application/application_service_registry.py
tidy_observation_table_writer(file_format: TidyObservationTableFileFormat) -> Type[TidyObservationTableWriter]
classmethod
¶Return a tidy table writer of the correct type.
Source code in src/taxpasta/infrastructure/application/application_service_registry.py
wide_observation_table_writer(file_format: WideObservationTableFileFormat) -> Type[WideObservationTableWriter]
classmethod
¶Return a writer for wide observation tables in the specified format.
Source code in src/taxpasta/infrastructure/application/application_service_registry.py
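Each registry method above takes an enumeration member and returns a class, not an instance. The following is a minimal, stdlib-only sketch of that dispatch pattern. `Profiler`, `Reader`, `BrackenReader`, `KaijuReader`, and `Registry` are hypothetical stand-ins rather than taxpasta's classes, and the real registry may resolve types differently (for example through lazy imports).

```python
from enum import Enum
from typing import Dict, Type


class Profiler(str, Enum):
    """Hypothetical stand-in for SupportedProfiler."""

    bracken = "bracken"
    kaiju = "kaiju"


class Reader:
    """Hypothetical stand-in for the ProfileReader base class."""


class BrackenReader(Reader):
    """Hypothetical reader for one profiler."""


class KaijuReader(Reader):
    """Hypothetical reader for another profiler."""


class Registry:
    """Map enumeration members to reader classes (illustrative only)."""

    _readers: Dict[Profiler, Type[Reader]] = {
        Profiler.bracken: BrackenReader,
        Profiler.kaiju: KaijuReader,
    }

    @classmethod
    def profile_reader(cls, profiler: Profiler) -> Type[Reader]:
        # Return the class itself; the caller decides how to use it.
        try:
            return cls._readers[profiler]
        except KeyError:
            raise ValueError(f"Unsupported profiler: {profiler!r}") from None


# Look up a reader type from a plain string value.
reader_type = Registry.profile_reader(Profiler("bracken"))
```

Returning types keeps the lookup cheap and leaves construction or classmethod calls to the caller.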
bracken
¶
Classes¶
Modules¶
bracken_profile
¶Provide a description of the Bracken profile format.
BRACKEN_FRACTION_TOLERANCE = 0.01
module-attribute
¶BRACKEN_FRACTION_TOTAL = 1.0
module-attribute
¶BrackenProfile
¶
Bases: BaseDataFrameModel
Define the expected Bracken profile format.
Source code in src/taxpasta/infrastructure/application/bracken/bracken_profile.py
added_reads: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶fraction_total_reads: Series[float] = pa.Field(ge=0.0, le=1.0)
class-attribute
instance-attribute
¶kraken_assigned_reads: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶name: Series[str] = pa.Field()
class-attribute
instance-attribute
¶new_est_reads: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶taxonomy_id: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶taxonomy_lvl: Series[str] = pa.Field()
class-attribute
instance-attribute
¶check_added_reads_consistency(profile: DataFrame) -> Series[bool]
¶Check that Bracken added reads are consistent.
Source code in src/taxpasta/infrastructure/application/bracken/bracken_profile.py
check_compositionality(fraction_total_reads: Series[float]) -> bool
¶Check that the fractions of reads add up to one.
Source code in src/taxpasta/infrastructure/application/bracken/bracken_profile.py
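The two checks above can be sketched without pandas or pandera. The constants come from this module; everything else is an assumption: that the tolerance is applied as an absolute difference from the total, and that "consistent" means `kraken_assigned_reads + added_reads == new_est_reads` in every row. The actual schema checks may differ.

```python
import math
from typing import List

BRACKEN_FRACTION_TOTAL = 1.0
BRACKEN_FRACTION_TOLERANCE = 0.01


def check_compositionality(fractions: List[float]) -> bool:
    """Return True if the fractions sum to the total within the tolerance."""
    # Assumption: the tolerance is an absolute difference, not a relative one.
    return math.isclose(
        sum(fractions),
        BRACKEN_FRACTION_TOTAL,
        rel_tol=0.0,
        abs_tol=BRACKEN_FRACTION_TOLERANCE,
    )


def check_added_reads_consistency(
    kraken_assigned: List[int], added: List[int], new_est: List[int]
) -> List[bool]:
    """Return one boolean per row, mirroring the Series[bool] return type."""
    # Assumption: the new estimate equals the Kraken-assigned plus added reads.
    return [k + a == n for k, a, n in zip(kraken_assigned, added, new_est)]
```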
bracken_profile_reader
¶Provide a reader for Bracken profiles.
BrackenProfileReader
¶
Bases: ProfileReader
Define a reader for Bracken profiles.
Source code in src/taxpasta/infrastructure/application/bracken/bracken_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[BrackenProfile]
classmethod
¶Read a Bracken taxonomic profile from the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | BufferOrFilepath | A source that contains a tab-separated taxonomic profile generated by Bracken. | required |
Returns:
Type | Description |
---|---|
DataFrame[BrackenProfile] | A data frame representation of the Bracken profile. |
Source code in src/taxpasta/infrastructure/application/bracken/bracken_profile_reader.py
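To illustrate what accepting either a buffer or a file path can look like, here is a stdlib-only sketch that parses a tab-separated table into a list of dictionaries. The function `read_tsv` is hypothetical; the documented reader returns a data frame typed as `DataFrame[BrackenProfile]`, which this sketch does not attempt to reproduce. The header in the example uses the field names listed for `BrackenProfile`.

```python
import csv
import io
from pathlib import Path
from typing import Dict, List, TextIO, Union


def read_tsv(source: Union[str, Path, TextIO]) -> List[Dict[str, str]]:
    """Parse a tab-separated table from a path or an open text buffer."""
    if isinstance(source, (str, Path)):
        # A path: open the file ourselves and close it when done.
        with open(source, newline="") as handle:
            return list(csv.DictReader(handle, delimiter="\t"))
    # Otherwise assume an already open text buffer.
    return list(csv.DictReader(source, delimiter="\t"))


example = io.StringIO(
    "name\ttaxonomy_id\ttaxonomy_lvl\tkraken_assigned_reads\t"
    "added_reads\tnew_est_reads\tfraction_total_reads\n"
    "Escherichia coli\t562\tS\t90\t10\t100\t1.0\n"
)
rows = read_tsv(example)
```

All values stay strings here; type conversion and validation are left out of the sketch.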
bracken_profile_standardisation_service
¶Provide a standardisation service for Bracken profiles.
BrackenProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for Bracken profiles.
Source code in src/taxpasta/infrastructure/application/bracken/bracken_profile_standardisation_service.py
transform(profile: DataFrame[BrackenProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given Bracken profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | DataFrame[BrackenProfile] | A taxonomic profile generated by Bracken. | required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile] | A standardized profile. |
Raises:
Type | Description |
---|---|
SchemaErrors | If the given profile does not conform with the |
Source code in src/taxpasta/infrastructure/application/bracken/bracken_profile_standardisation_service.py
centrifuge
¶
Classes¶
Modules¶
centrifuge_profile
¶Provide a description of the centrifuge profile format.
CENTRIFUGE_PERCENT_TOLERANCE = 1.0
module-attribute
¶CENTRIFUGE_PERCENT_TOTAL = 100.0
module-attribute
¶CentrifugeProfile
¶
Bases: BaseDataFrameModel
Define the expected centrifuge profile format.
Source code in src/taxpasta/infrastructure/application/centrifuge/centrifuge_profile.py
clade_assigned_reads: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶direct_assigned_reads: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶name: Series[str] = pa.Field()
class-attribute
instance-attribute
¶percent: Series[float] = pa.Field(ge=0.0, le=100.0)
class-attribute
instance-attribute
¶taxonomy_id: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶taxonomy_level: Series[str] = pa.Field()
class-attribute
instance-attribute
¶check_compositionality(percent: Series[float]) -> bool
¶Check that the percent of 'unclassified' and 'root' add up to a hundred.
Source code in src/taxpasta/infrastructure/application/centrifuge/centrifuge_profile.py
centrifuge_profile_reader
¶Provide a reader for centrifuge profiles.
CentrifugeProfileReader
¶
Bases: ProfileReader
Define a reader for centrifuge profiles.
Source code in src/taxpasta/infrastructure/application/centrifuge/centrifuge_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[CentrifugeProfile]
classmethod
¶Read a centrifuge taxonomic profile from the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | BufferOrFilepath | A source that contains a tab-separated taxonomic profile generated by centrifuge. | required |
Returns:
Type | Description |
---|---|
DataFrame[CentrifugeProfile] | A data frame representation of the centrifuge profile. |
Source code in src/taxpasta/infrastructure/application/centrifuge/centrifuge_profile_reader.py
centrifuge_profile_standardisation_service
¶Provide a standardisation service for centrifuge profiles.
logger = logging.getLogger(__name__)
module-attribute
¶CentrifugeProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for centrifuge profiles.
Source code in src/taxpasta/infrastructure/application/centrifuge/centrifuge_profile_standardisation_service.py
transform(profile: DataFrame[CentrifugeProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given centrifuge profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | DataFrame[CentrifugeProfile] | A taxonomic profile generated by centrifuge. | required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile] | A standardized profile. |
Source code in src/taxpasta/infrastructure/application/centrifuge/centrifuge_profile_standardisation_service.py
diamond
¶
Classes¶
Modules¶
diamond_profile
¶Provide a description of the diamond profile format.
DiamondProfile
¶
Bases: BaseDataFrameModel
Define the expected diamond profile format.
Source code in src/taxpasta/infrastructure/application/diamond/diamond_profile.py
diamond_profile_reader
¶Provide a reader for diamond profiles.
DiamondProfileReader
¶
Bases: ProfileReader
Define a reader for diamond profiles.
Source code in src/taxpasta/infrastructure/application/diamond/diamond_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[DiamondProfile]
classmethod
¶Read a diamond taxonomic profile from a file.
Source code in src/taxpasta/infrastructure/application/diamond/diamond_profile_reader.py
diamond_profile_standardisation_service
¶Provide a standardisation service for diamond profiles.
DiamondProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for diamond profiles.
Source code in src/taxpasta/infrastructure/application/diamond/diamond_profile_standardisation_service.py
transform(profile: DataFrame[DiamondProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given diamond profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | DataFrame[DiamondProfile] | A taxonomic profile generated by diamond. | required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile] | A standardized profile. |
Source code in src/taxpasta/infrastructure/application/diamond/diamond_profile_standardisation_service.py
ganon
¶
Classes¶
Modules¶
ganon_profile
¶Provide a description of the ganon profile format.
GANON_PERCENT_TOLERANCE = 1.0
module-attribute
¶GANON_PERCENT_TOTAL = 100.0
module-attribute
¶GanonProfile
¶
Bases: BaseDataFrameModel
Define the expected ganon profile format.
Source code in src/taxpasta/infrastructure/application/ganon/ganon_profile.py
lineage: Series[str] = pa.Field()
class-attribute
instance-attribute
¶name: Series[str] = pa.Field()
class-attribute
instance-attribute
¶number_children: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶number_cumulative: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶number_shared: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶number_unique: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶percent_cumulative: Series[float] = pa.Field(ge=0.0, le=100.0)
class-attribute
instance-attribute
¶rank: Series[str] = pa.Field()
class-attribute
instance-attribute
¶target: Series[str] = pa.Field()
class-attribute
instance-attribute
¶check_compositionality(profile: pd.DataFrame) -> bool
¶Check that the percent of 'unclassified' and 'root' add up to a hundred.
Source code in src/taxpasta/infrastructure/application/ganon/ganon_profile.py
ganon_profile_reader
¶Provide a reader for ganon profiles.
GanonProfileReader
¶
Bases: ProfileReader
Define a reader for ganon profiles.
Source code in src/taxpasta/infrastructure/application/ganon/ganon_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[GanonProfile]
classmethod
¶Read a ganon taxonomic profile from the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | BufferOrFilepath | A source that contains a tab-separated taxonomic profile generated by ganon. | required |
Returns:
Type | Description |
---|---|
DataFrame[GanonProfile] | A data frame representation of the ganon profile. |
Source code in src/taxpasta/infrastructure/application/ganon/ganon_profile_reader.py
ganon_profile_standardisation_service
¶Provide a standardisation service for ganon profiles.
logger = logging.getLogger(__name__)
module-attribute
¶GanonProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for ganon profiles.
Source code in src/taxpasta/infrastructure/application/ganon/ganon_profile_standardisation_service.py
transform(profile: DataFrame[GanonProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given ganon profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | DataFrame[GanonProfile] | A taxonomic profile generated by ganon. | required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile] | A standardized profile. |
Source code in src/taxpasta/infrastructure/application/ganon/ganon_profile_standardisation_service.py
kaiju
¶
Classes¶
Modules¶
kaiju_profile
¶Provide a description of the kaiju profile format.
KAIJU_PERCENT_TOLERANCE = 1.0
module-attribute
¶KAIJU_PERCENT_TOTAL = 100.0
module-attribute
¶KaijuProfile
¶
Bases: BaseDataFrameModel
Define the expected kaiju profile format.
Source code in src/taxpasta/infrastructure/application/kaiju/kaiju_profile.py
file: Series[str] = pa.Field()
class-attribute
instance-attribute
¶percent: Series[float] = pa.Field(ge=0.0, le=100.0)
class-attribute
instance-attribute
¶reads: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶taxon_id: Series[pd.Int64Dtype] = pa.Field(nullable=True)
class-attribute
instance-attribute
¶taxon_name: Series[str] = pa.Field()
class-attribute
instance-attribute
¶check_compositionality(percent: Series[float]) -> bool
¶Check that the percentages add up to a hundred.
Source code in src/taxpasta/infrastructure/application/kaiju/kaiju_profile.py
check_unique_filename(file_col: Series[str]) -> bool
¶Check that the kaiju filename is unique.
kaiju_profile_reader
¶Provide a reader for kaiju profiles.
KaijuProfileReader
¶
Bases: ProfileReader
Define a reader for kaiju profiles.
Source code in src/taxpasta/infrastructure/application/kaiju/kaiju_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[KaijuProfile]
classmethod
¶Read a kaiju taxonomic profile from the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | BufferOrFilepath | A source that contains a tab-separated taxonomic profile generated by kaiju. | required |
Returns:
Type | Description |
---|---|
DataFrame[KaijuProfile] | A data frame representation of the kaiju profile. |
Source code in src/taxpasta/infrastructure/application/kaiju/kaiju_profile_reader.py
kaiju_profile_standardisation_service
¶Provide a standardisation service for kaiju profiles.
KaijuProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for kaiju profiles.
Source code in src/taxpasta/infrastructure/application/kaiju/kaiju_profile_standardisation_service.py
transform(profile: DataFrame[KaijuProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given kaiju profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | DataFrame[KaijuProfile] | A taxonomic profile generated by kaiju. | required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile] | A standardized profile. |
Source code in src/taxpasta/infrastructure/application/kaiju/kaiju_profile_standardisation_service.py
kmcp
¶
Classes¶
Modules¶
kmcp_profile
¶Provide a description of the KMCP profile format.
KMCP_PERCENT_TOLERANCE = 1.0
module-attribute
¶KMCP_PERCENT_TOTAL = 100.0
module-attribute
¶KMCPProfile
¶
Bases: BaseDataFrameModel
Define the expected KMCP profile format.
Source code in src/taxpasta/infrastructure/application/kmcp/kmcp_profile.py
chunks_fraction: Series[float] = pa.Field(ge=0.0, le=1.0, alias='chunksFrac')
class-attribute
instance-attribute
¶chunks_relative_depth: Series[str] = pa.Field(alias='chunksRelDepth')
class-attribute
instance-attribute
¶chunks_relative_depth_std: Series[float] = pa.Field(ge=0.0, nullable=True, alias='chunksRelDepthStd')
class-attribute
instance-attribute
¶coverage: Series[float] = pa.Field(ge=0.0, nullable=True)
class-attribute
instance-attribute
¶high_confidence_unique_reads: Series[int] = pa.Field(ge=0, alias='hicureads')
class-attribute
instance-attribute
¶percentage: Series[float] = pa.Field(ge=0.0, le=100.0)
class-attribute
instance-attribute
¶rank: Series[str] = pa.Field(nullable=True)
class-attribute
instance-attribute
¶reads: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶reference: Series[str] = pa.Field(alias='ref')
class-attribute
instance-attribute
¶reference_name: Series[str] = pa.Field(nullable=True, alias='refname')
class-attribute
instance-attribute
¶reference_size: Series[int] = pa.Field(ge=0, alias='refsize')
class-attribute
instance-attribute
¶score: Series[float] = pa.Field(ge=0.0, le=100.0)
class-attribute
instance-attribute
¶taxid: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶taxonomic_name: Series[str] = pa.Field(nullable=True, alias='taxname')
class-attribute
instance-attribute
¶taxonomic_path: Series[str] = pa.Field(nullable=True, alias='taxpath')
class-attribute
instance-attribute
¶taxonomic_path_lineage: Series[str] = pa.Field(nullable=True, alias='taxpathsn')
class-attribute
instance-attribute
¶unique_reads: Series[int] = pa.Field(ge=0, alias='ureads')
class-attribute
instance-attribute
¶check_compositionality(percentage: Series[float]) -> bool
¶Check that the percentages add up to a hundred.
Source code in src/taxpasta/infrastructure/application/kmcp/kmcp_profile.py
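Several `KMCPProfile` fields declare an `alias`. In pandera, the alias is the column name expected in the data, while the attribute name is only used on the Python side. The stdlib-only sketch below shows that relationship with the aliases listed above; `column_name` and `missing_columns` are hypothetical helpers, not part of taxpasta.

```python
from typing import Dict, List

# Attribute name -> column name in the KMCP output, copied from the fields above.
ALIASES: Dict[str, str] = {
    "chunks_fraction": "chunksFrac",
    "chunks_relative_depth": "chunksRelDepth",
    "chunks_relative_depth_std": "chunksRelDepthStd",
    "high_confidence_unique_reads": "hicureads",
    "reference": "ref",
    "reference_name": "refname",
    "reference_size": "refsize",
    "taxonomic_name": "taxname",
    "taxonomic_path": "taxpath",
    "taxonomic_path_lineage": "taxpathsn",
    "unique_reads": "ureads",
}


def column_name(attribute: str) -> str:
    """Return the column name for an attribute; unaliased fields keep their name."""
    return ALIASES.get(attribute, attribute)


def missing_columns(header: List[str], attributes: List[str]) -> List[str]:
    """List the expected column names that are absent from a table header."""
    present = set(header)
    return [column_name(a) for a in attributes if column_name(a) not in present]
```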
kmcp_profile_reader
¶Provide a reader for KMCP profiles.
KMCPProfileReader
¶
Bases: ProfileReader
Define a reader for KMCP profiles.
Source code in src/taxpasta/infrastructure/application/kmcp/kmcp_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[KMCPProfile]
classmethod
¶Read a KMCP taxonomic profile from the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | BufferOrFilepath | A source that contains a tab-separated taxonomic profile generated by KMCP. | required |
Returns:
Type | Description |
---|---|
DataFrame[KMCPProfile] | A data frame representation of the KMCP profile. |
Source code in src/taxpasta/infrastructure/application/kmcp/kmcp_profile_reader.py
kmcp_profile_standardisation_service
¶Provide a standardisation service for KMCP profiles.
logger = logging.getLogger(__name__)
module-attribute
¶KMCPProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for KMCP profiles.
Source code in src/taxpasta/infrastructure/application/kmcp/kmcp_profile_standardisation_service.py
transform(profile: DataFrame[KMCPProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given KMCP profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | DataFrame[KMCPProfile] | A taxonomic profile generated by KMCP. | required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile] | A standardized profile. |
Source code in src/taxpasta/infrastructure/application/kmcp/kmcp_profile_standardisation_service.py
kraken2
¶
Classes¶
Modules¶
kraken2_profile
¶Provide a description of the kraken2 profile format.
KRAKEN2_PERCENT_TOLERANCE = 1.0
module-attribute
¶KRAKEN2_PERCENT_TOTAL = 100.0
module-attribute
¶Kraken2Profile
¶
Bases: BaseDataFrameModel
Define the expected kraken2 profile format.
Source code in src/taxpasta/infrastructure/application/kraken2/kraken2_profile.py
clade_assigned_reads: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶direct_assigned_reads: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶distinct_minimizers: Optional[Series[int]] = pa.Field(ge=0)
class-attribute
instance-attribute
¶name: Series[str] = pa.Field()
class-attribute
instance-attribute
¶num_minimizers: Optional[Series[int]] = pa.Field(ge=0)
class-attribute
instance-attribute
¶percent: Series[float] = pa.Field(ge=0.0, le=100.0)
class-attribute
instance-attribute
¶taxonomy_id: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶taxonomy_lvl: Series[str] = pa.Field()
class-attribute
instance-attribute
¶check_compositionality(profile: pd.DataFrame) -> bool
¶Check that the percent of 'unclassified' and 'root' add up to a hundred.
Source code in src/taxpasta/infrastructure/application/kraken2/kraken2_profile.py
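Unlike a plain column sum, this check only considers the 'unclassified' and 'root' rows, because every other row is already counted within the root clade. A stdlib-only sketch follows. The constants come from this module; identifying the two rows by the rank codes `U` and `R` in `taxonomy_lvl`, and treating the tolerance as an absolute difference, are assumptions made for illustration.

```python
import math
from typing import Dict, List, Union

KRAKEN2_PERCENT_TOTAL = 100.0
KRAKEN2_PERCENT_TOLERANCE = 1.0

Row = Dict[str, Union[str, float]]


def check_compositionality(rows: List[Row]) -> bool:
    """Return True if unclassified plus root percentages are close to 100."""
    # Assumption: 'U' marks the unclassified row and 'R' marks the root row.
    total = sum(
        float(row["percent"]) for row in rows if row["taxonomy_lvl"] in ("U", "R")
    )
    return math.isclose(
        total,
        KRAKEN2_PERCENT_TOTAL,
        rel_tol=0.0,
        abs_tol=KRAKEN2_PERCENT_TOLERANCE,
    )


report = [
    {"taxonomy_lvl": "U", "percent": 20.5},
    {"taxonomy_lvl": "R", "percent": 79.4},
    {"taxonomy_lvl": "D", "percent": 79.0},  # nested in root, so not added again
]
```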
kraken2_profile_reader
¶Provide a reader for kraken2 profiles.
Kraken2ProfileReader
¶
Bases: ProfileReader
Define a reader for kraken2 profiles.
Source code in src/taxpasta/infrastructure/application/kraken2/kraken2_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[Kraken2Profile]
classmethod
¶Read a kraken2 taxonomic profile from the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | BufferOrFilepath | A source that contains a tab-separated taxonomic profile generated by kraken2. | required |
Returns:
Type | Description |
---|---|
DataFrame[Kraken2Profile] | A data frame representation of the kraken2 profile. |
Raises:
Type | Description |
---|---|
ValueError | In case the table does not contain exactly six or eight columns. |
Source code in src/taxpasta/infrastructure/application/kraken2/kraken2_profile_reader.py
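The `ValueError` condition above can be sketched with the standard `csv` module. The function `read_report` is hypothetical, and the column order is an assumption based on the kraken2 report layout, where the two minimizer columns only appear when minimizer data is reported. The actual reader may detect and name columns differently.

```python
import csv
import io
from typing import Dict, List, TextIO

SIX_COLUMNS = [
    "percent", "clade_assigned_reads", "direct_assigned_reads",
    "taxonomy_lvl", "taxonomy_id", "name",
]
# Assumption: the minimizer columns sit between the read counts and the rank.
EIGHT_COLUMNS = [
    "percent", "clade_assigned_reads", "direct_assigned_reads",
    "num_minimizers", "distinct_minimizers",
    "taxonomy_lvl", "taxonomy_id", "name",
]


def read_report(handle: TextIO) -> List[Dict[str, str]]:
    """Parse a headerless, tab-separated report with six or eight columns."""
    rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    if not rows:
        return []
    widths = {len(row) for row in rows}
    if widths == {len(SIX_COLUMNS)}:
        columns = SIX_COLUMNS
    elif widths == {len(EIGHT_COLUMNS)}:
        columns = EIGHT_COLUMNS
    else:
        raise ValueError(
            f"Expected exactly six or eight columns, found {sorted(widths)}."
        )
    return [dict(zip(columns, row)) for row in rows]


parsed = read_report(io.StringIO("100.00\t10\t0\tR\t1\troot\n"))
```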
kraken2_profile_standardisation_service
¶Provide a standardisation service for kraken2 profiles.
Kraken2ProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for kraken2 profiles.
Source code in src/taxpasta/infrastructure/application/kraken2/kraken2_profile_standardisation_service.py
transform(profile: DataFrame[Kraken2Profile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given kraken2 profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | DataFrame[Kraken2Profile] | A taxonomic profile generated by kraken2. | required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile] | A standardized profile. |
Source code in src/taxpasta/infrastructure/application/kraken2/kraken2_profile_standardisation_service.py
krakenuniq
¶
Classes¶
Modules¶
krakenuniq_profile
¶Provide a description of the KrakenUniq profile format.
KrakenUniqProfile
¶
Bases: BaseDataFrameModel
Define the expected KrakenUniq profile format.
Source code in src/taxpasta/infrastructure/application/krakenuniq/krakenuniq_profile.py
coverage: Series[float] = pa.Field(ge=0.0, nullable=True, alias='cov')
class-attribute
instance-attribute
¶duplicates: Series[float] = pa.Field(ge=0.0, alias='dup')
class-attribute
instance-attribute
¶kmers: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶percent: Series[float] = pa.Field(ge=0.0, le=100.0, alias='%')
class-attribute
instance-attribute
¶rank: Series[str] = pa.Field()
class-attribute
instance-attribute
¶reads: Series[int] = pa.Field(ge=0)
class-attribute
instance-attribute
¶tax_id: Series[int] = pa.Field(alias='taxID', ge=0)
class-attribute
instance-attribute
¶tax_name: Series[str] = pa.Field(alias='taxName')
class-attribute
instance-attribute
¶tax_reads: Series[int] = pa.Field(ge=0, alias='taxReads')
class-attribute
instance-attribute
¶krakenuniq_profile_reader
¶Provide a reader for KrakenUniq profiles.
KrakenUniqProfileReader
¶
Bases: ProfileReader
Define a reader for KrakenUniq profiles.
Source code in src/taxpasta/infrastructure/application/krakenuniq/krakenuniq_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[KrakenUniqProfile]
classmethod
¶Read a KrakenUniq taxonomic profile from the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | BufferOrFilepath | A source that contains a tab-separated taxonomic profile generated by KrakenUniq. | required |
Returns:
Type | Description |
---|---|
DataFrame[KrakenUniqProfile] | A data frame representation of the KrakenUniq profile. |
Source code in src/taxpasta/infrastructure/application/krakenuniq/krakenuniq_profile_reader.py
krakenuniq_profile_standardisation_service
¶Provide a standardisation service for KrakenUniq profiles.
KrakenUniqProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for KrakenUniq profiles.
Source code in src/taxpasta/infrastructure/application/krakenuniq/krakenuniq_profile_standardisation_service.py
transform(profile: DataFrame[KrakenUniqProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given KrakenUniq profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | DataFrame[KrakenUniqProfile] | A taxonomic profile generated by KrakenUniq. | required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile] | A standardized profile. |
Source code in src/taxpasta/infrastructure/application/krakenuniq/krakenuniq_profile_standardisation_service.py
megan6
¶
Classes¶
Modules¶
megan6_profile
¶Provide a description of the MEGAN6 rma2info profile format.
Megan6Profile
¶
Bases: BaseDataFrameModel
Define the expected MEGAN6 rma2info profile format.
Source code in src/taxpasta/infrastructure/application/megan6/megan6_profile.py
megan6_profile_reader
¶Provide a reader for megan6 profiles.
Megan6ProfileReader
¶
Bases: ProfileReader
Define a reader for MEGAN6 rma2info profiles.
Source code in src/taxpasta/infrastructure/application/megan6/megan6_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[Megan6Profile]
classmethod
¶Read a MEGAN6 rma2info taxonomic profile from a file.
Source code in src/taxpasta/infrastructure/application/megan6/megan6_profile_reader.py
megan6_profile_standardisation_service
¶Provide a standardisation service for megan6 profiles.
Megan6ProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for megan6 profiles.
Source code in src/taxpasta/infrastructure/application/megan6/megan6_profile_standardisation_service.py
transform(profile: DataFrame[Megan6Profile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given MEGAN6 rma2info profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | DataFrame[Megan6Profile] | A taxonomic profile generated by MEGAN6 rma2info. | required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile] | A standardized profile. |
Source code in src/taxpasta/infrastructure/application/megan6/megan6_profile_standardisation_service.py
metaphlan
¶
Classes¶
Modules¶
metaphlan_profile
¶Provide a description of the metaphlan profile format.
METAPHLAN_PERCENT_TOLERANCE = 1.0
module-attribute
¶METAPHLAN_PERCENT_TOTAL = 100.0
module-attribute
¶MetaphlanProfile
¶
Bases: BaseDataFrameModel
Define the expected metaphlan profile format.
Source code in src/taxpasta/infrastructure/application/metaphlan/metaphlan_profile.py
additional_species: Optional[Series[str]] = pa.Field(nullable=True)
class-attribute
instance-attribute
¶clade_name: Series[str] = pa.Field()
class-attribute
instance-attribute
¶ncbi_tax_id: Series[str] = pa.Field(alias='NCBI_tax_id')
class-attribute
instance-attribute
¶relative_abundance: Series[float] = pa.Field(ge=0.0, le=100.0)
class-attribute
instance-attribute
¶check_compositionality(profile: pd.DataFrame) -> bool
¶Check that the percentages per rank add up to a hundred.
Source code in src/taxpasta/infrastructure/application/metaphlan/metaphlan_profile.py
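A per-rank check needs a rank for each row. In MetaPhlAn output, `clade_name` is a `|`-separated lineage whose elements carry a rank prefix such as `k__` or `p__`, so the rank of a row can be read from its last element. The stdlib-only sketch below groups abundances that way. The helper `rank_of`, the absolute tolerance, and skipping rows without a rank prefix are assumptions, and the real schema check may handle these cases differently.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

METAPHLAN_PERCENT_TOTAL = 100.0
METAPHLAN_PERCENT_TOLERANCE = 1.0


def rank_of(clade_name: str) -> str:
    """Return the rank prefix of the most specific clade, e.g. 'p' for 'k__A|p__B'."""
    last = clade_name.split("|")[-1]
    prefix, separator, _ = last.partition("__")
    return prefix if separator else ""


def check_compositionality(rows: List[Tuple[str, float]]) -> bool:
    """Return True if the abundances of every rank sum to roughly 100."""
    totals: Dict[str, float] = defaultdict(float)
    for clade_name, abundance in rows:
        rank = rank_of(clade_name)
        if rank:  # Assumption: rows without a rank prefix are skipped here.
            totals[rank] += abundance
    return all(
        math.isclose(
            total,
            METAPHLAN_PERCENT_TOTAL,
            rel_tol=0.0,
            abs_tol=METAPHLAN_PERCENT_TOLERANCE,
        )
        for total in totals.values()
    )


profile_rows = [
    ("k__Bacteria", 100.0),
    ("k__Bacteria|p__Firmicutes", 60.0),
    ("k__Bacteria|p__Proteobacteria", 39.5),
]
```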
metaphlan_profile_reader
¶Provide a reader for metaphlan profiles.
MetaphlanProfileReader
¶
Bases: ProfileReader
Define a reader for metaphlan profiles.
Source code in src/taxpasta/infrastructure/application/metaphlan/metaphlan_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[MetaphlanProfile]
classmethod
¶Read a metaphlan taxonomic profile from a file.
Source code in src/taxpasta/infrastructure/application/metaphlan/metaphlan_profile_reader.py
metaphlan_profile_standardisation_service
¶Provide a standardisation service for metaphlan profiles.
logger = logging.getLogger(__name__)
module-attribute
¶MetaphlanProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for metaphlan profiles.
Source code in src/taxpasta/infrastructure/application/metaphlan/metaphlan_profile_standardisation_service.py
LARGE_INTEGER = 1000000
class-attribute
instance-attribute
¶transform(profile: DataFrame[MetaphlanProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given metaphlan profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | DataFrame[MetaphlanProfile] | A taxonomic profile generated by metaphlan. | required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile] | A standardized profile. |
Source code in src/taxpasta/infrastructure/application/metaphlan/metaphlan_profile_standardisation_service.py
motus
¶
Classes¶
Modules¶
motus_profile
¶Provide a description of the mOTUs profile format.
MotusProfile
¶
Bases: BaseDataFrameModel
Define the expected mOTUs profile format.
Source code in src/taxpasta/infrastructure/application/motus/motus_profile.py
motus_profile_reader
¶Provide a reader for mOTUs profiles.
MotusProfileReader
¶
Bases: ProfileReader
Define a reader for mOTUs profiles.
Source code in src/taxpasta/infrastructure/application/motus/motus_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[MotusProfile]
classmethod
¶Read a mOTUs taxonomic profile from a file.
Source code in src/taxpasta/infrastructure/application/motus/motus_profile_reader.py
motus_profile_standardisation_service
¶Provide a standardisation service for mOTUs profiles.
MotusProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for mOTUs profiles.
Source code in src/taxpasta/infrastructure/application/motus/motus_profile_standardisation_service.py
transform(profile: DataFrame[MotusProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given mOTUs profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile | DataFrame[MotusProfile] | A taxonomic profile generated by mOTUs. | required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile] | A standardized profile. |
Source code in src/taxpasta/infrastructure/application/motus/motus_profile_standardisation_service.py
sample_sheet
¶
Provide a description of samples and profile locations.
Classes¶
SampleSheet
¶
Bases: DataFrameModel
Define a description of samples and profile locations.
Source code in src/taxpasta/infrastructure/application/sample_sheet.py
profile: Series[str] = pa.Field()
class-attribute
instance-attribute
¶sample: Series[str] = pa.Field()
class-attribute
instance-attribute
¶Config
¶Configure the schema model.
Source code in src/taxpasta/infrastructure/application/sample_sheet.py
check_number_samples(table: DataFrame) -> bool
classmethod
¶Check that there are at least two samples.
Source code in src/taxpasta/infrastructure/application/sample_sheet.py
check_profile_presence(profile: Series[str]) -> Series[bool]
classmethod
¶Check that every profile is present at the specified location.
Source code in src/taxpasta/infrastructure/application/sample_sheet.py
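The two sample sheet checks can be sketched with the standard library alone. The column names `sample` and `profile` come from the schema above. Counting distinct sample names rather than rows, and treating "present" as "is an existing file", are assumptions made for this illustration.

```python
from pathlib import Path
from typing import Dict, List


def check_number_samples(table: List[Dict[str, str]]) -> bool:
    """Return True if the sheet describes at least two samples."""
    # Assumption: samples are counted by distinct name, not by row.
    return len({row["sample"] for row in table}) >= 2


def check_profile_presence(profiles: List[str]) -> List[bool]:
    """Return one boolean per profile location, mirroring Series[bool]."""
    # Assumption: a profile is present when its location is an existing file.
    return [Path(location).is_file() for location in profiles]
```

A table-wide check returns a single boolean, whereas a column check returns one result per row so that failures can be traced to individual entries.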
standard_profile_file_format
¶
Provide a service for supported tabular file formats.
Classes¶
StandardProfileFileFormat
¶
Bases: str, DependencyCheckMixin, Enum
Define the supported standardized profile file formats.
Source code in src/taxpasta/infrastructure/application/standard_profile_file_format.py
CSV = 'CSV'
class-attribute
instance-attribute
¶
ODS = 'ODS'
class-attribute
instance-attribute
¶
TSV = 'TSV'
class-attribute
instance-attribute
¶
XLSX = 'XLSX'
class-attribute
instance-attribute
¶
arrow = 'arrow'
class-attribute
instance-attribute
¶
parquet = 'parquet'
class-attribute
instance-attribute
¶
standard_profile_writer
¶
Modules¶
arrow_standard_profile_writer
¶Provide an arrow writer.
ArrowStandardProfileWriter
¶
Bases: StandardProfileWriter
Define the arrow writer.
Source code in src/taxpasta/infrastructure/application/standard_profile_writer/arrow_standard_profile_writer.py
csv_standard_profile_writer
¶Provide a CSV writer.
CSVStandardProfileWriter
¶
Bases: StandardProfileWriter
Define the CSV writer.
Source code in src/taxpasta/infrastructure/application/standard_profile_writer/csv_standard_profile_writer.py
ods_standard_profile_writer
¶Provide an ODS writer.
ODSStandardProfileWriter
¶
Bases: StandardProfileWriter
Define the ODS writer.
Source code in src/taxpasta/infrastructure/application/standard_profile_writer/ods_standard_profile_writer.py
parquet_standard_profile_writer
¶Provide a parquet writer.
ParquetStandardProfileWriter
¶
Bases: StandardProfileWriter
Define the parquet writer.
Source code in src/taxpasta/infrastructure/application/standard_profile_writer/parquet_standard_profile_writer.py
tsv_standard_profile_writer
¶Provide a TSV writer.
TSVStandardProfileWriter
¶
Bases: StandardProfileWriter
Define the TSV writer.
Source code in src/taxpasta/infrastructure/application/standard_profile_writer/tsv_standard_profile_writer.py
xlsx_standard_profile_writer
¶Provide an XLSX writer.
XLSXStandardProfileWriter
¶
Bases: StandardProfileWriter
Define the XLSX writer.
Source code in src/taxpasta/infrastructure/application/standard_profile_writer/xlsx_standard_profile_writer.py
supported_profiler
¶
Provide an enumeration of supported taxonomic profilers.
Classes¶
SupportedProfiler
¶
Bases: str, Enum
Define supported taxonomic profilers.
Source code in src/taxpasta/infrastructure/application/supported_profiler.py
bracken = 'bracken'
class-attribute
instance-attribute
¶
centrifuge = 'centrifuge'
class-attribute
instance-attribute
¶
diamond = 'diamond'
class-attribute
instance-attribute
¶
ganon = 'ganon'
class-attribute
instance-attribute
¶
kaiju = 'kaiju'
class-attribute
instance-attribute
¶
kmcp = 'kmcp'
class-attribute
instance-attribute
¶
kraken2 = 'kraken2'
class-attribute
instance-attribute
¶
krakenuniq = 'krakenuniq'
class-attribute
instance-attribute
¶
megan6 = 'megan6'
class-attribute
instance-attribute
¶
metaphlan = 'metaphlan'
class-attribute
instance-attribute
¶
motus = 'motus'
class-attribute
instance-attribute
¶
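Because the enumeration derives from both `str` and `Enum`, its members compare equal to plain strings. The sketch below shows that pattern with a stand-in class named `Profiler` that holds only three of the members listed above. The class name and the `parse_profiler` helper are hypothetical and not part of taxpasta.

```python
from enum import Enum


class Profiler(str, Enum):
    """Stand-in for a str-based enumeration of profilers."""

    bracken = "bracken"
    kraken2 = "kraken2"
    motus = "motus"


def parse_profiler(value: str) -> Profiler:
    """Look up a member by value, ignoring case (hypothetical helper)."""
    return Profiler(value.lower())


selected = parse_profiler("Kraken2")
```

An unknown value raises a `ValueError` from the `Enum` lookup, which is a convenient point for a command-line layer to report unsupported profilers.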
table_reader
¶
Modules¶
arrow_table_reader
¶Provide an arrow reader.
csv_table_reader
¶
ods_table_reader
¶Provide an ODS reader.
parquet_table_reader
¶Provide a parquet reader.
ParquetTableReader
¶
Bases: TableReader
Define the parquet reader.
Source code in src/taxpasta/infrastructure/application/table_reader/parquet_table_reader.py
tsv_table_reader
¶
xlsx_table_reader
¶Provide an XLSX reader.
table_reader_file_format
¶
Provide a service for supported tabular file formats.
Classes¶
TableReaderFileFormat
¶
Bases: str, DependencyCheckMixin, Enum
Define the supported tabular file formats.
Source code in src/taxpasta/infrastructure/application/table_reader_file_format.py
CSV = 'CSV'
class-attribute
instance-attribute
¶
ODS = 'ODS'
class-attribute
instance-attribute
¶
TSV = 'TSV'
class-attribute
instance-attribute
¶
XLSX = 'XLSX'
class-attribute
instance-attribute
¶
arrow = 'arrow'
class-attribute
instance-attribute
¶
parquet = 'parquet'
class-attribute
instance-attribute
¶
tidy_observation_table_file_format
¶
Provide a service for supported tabular file formats.
Classes¶
TidyObservationTableFileFormat
¶
Bases: str, DependencyCheckMixin, Enum
Define the supported tabular file formats.
Source code in src/taxpasta/infrastructure/application/tidy_observation_table_file_format.py
CSV = 'CSV'
class-attribute
instance-attribute
¶
ODS = 'ODS'
class-attribute
instance-attribute
¶
TSV = 'TSV'
class-attribute
instance-attribute
¶
XLSX = 'XLSX'
class-attribute
instance-attribute
¶
arrow = 'arrow'
class-attribute
instance-attribute
¶
parquet = 'parquet'
class-attribute
instance-attribute
¶
tidy_observation_table_writer
¶
Modules¶
arrow_table_writer
¶Provide an arrow writer.
ArrowTidyObservationTableWriter
¶
Bases: TidyObservationTableWriter
Define the arrow writer.
Source code in src/taxpasta/infrastructure/application/tidy_observation_table_writer/arrow_table_writer.py
csv_table_writer
¶Provide a CSV writer.
CSVTidyObservationTableWriter
¶
Bases: TidyObservationTableWriter
Define the CSV writer.
Source code in src/taxpasta/infrastructure/application/tidy_observation_table_writer/csv_table_writer.py
ods_table_writer
¶Provide an ODS writer.
ODSTidyObservationTableWriter
¶
Bases: TidyObservationTableWriter
Define the ODS writer.
Source code in src/taxpasta/infrastructure/application/tidy_observation_table_writer/ods_table_writer.py
parquet_table_writer
¶Provide a parquet writer.
ParquetTidyObservationTableWriter
¶
Bases: TidyObservationTableWriter
Define the parquet writer.
Source code in src/taxpasta/infrastructure/application/tidy_observation_table_writer/parquet_table_writer.py
tsv_table_writer
¶Provide a TSV writer.
TSVTidyObservationTableWriter
¶
Bases: TidyObservationTableWriter
Define the TSV writer.
Source code in src/taxpasta/infrastructure/application/tidy_observation_table_writer/tsv_table_writer.py
xlsx_table_writer
¶Provide an XLSX writer.
XLSXTidyObservationTableWriter
¶
Bases: TidyObservationTableWriter
Define the XLSX writer.
Source code in src/taxpasta/infrastructure/application/tidy_observation_table_writer/xlsx_table_writer.py
wide_observation_table_file_format
¶
Provide a service for supported container file formats.
Classes¶
WideObservationTableFileFormat
¶
Bases: str, DependencyCheckMixin, Enum
Define the supported container file formats.
Source code in src/taxpasta/infrastructure/application/wide_observation_table_file_format.py
BIOM = 'BIOM'
class-attribute
instance-attribute
¶
CSV = 'CSV'
class-attribute
instance-attribute
¶
ODS = 'ODS'
class-attribute
instance-attribute
¶
TSV = 'TSV'
class-attribute
instance-attribute
¶
XLSX = 'XLSX'
class-attribute
instance-attribute
¶
arrow = 'arrow'
class-attribute
instance-attribute
¶
parquet = 'parquet'
class-attribute
instance-attribute
¶
wide_observation_table_writer
¶
Modules¶
arrow_wide_observation_table_writer
¶Provide an arrow writer.
ArrowWideObservationTableWriter
¶
Bases: WideObservationTableWriter
Define the arrow writer.
Source code in src/taxpasta/infrastructure/application/wide_observation_table_writer/arrow_wide_observation_table_writer.py
biom_wide_observation_table_writer
¶Provide a Biological Observation Matrix (BIOM) writer.
BIOMWideObservationTableWriter
¶
Bases: WideObservationTableWriter
Define the Biological Observation Matrix (BIOM) writer.
Source code in src/taxpasta/infrastructure/application/wide_observation_table_writer/biom_wide_observation_table_writer.py
write(matrix: DataFrame[WideObservationTable], target: Filepath, taxonomy: Optional[TaxonomyService] = None, generated_by: str = 'taxpasta', **kwargs) -> None
classmethod
¶Write the given data to the given buffer or file.
Source code in src/taxpasta/infrastructure/application/wide_observation_table_writer/biom_wide_observation_table_writer.py
csv_wide_observation_table_writer
¶Provide a CSV writer.
CSVWideObservationTableWriter
¶
Bases: WideObservationTableWriter
Define the CSV writer.
Source code in src/taxpasta/infrastructure/application/wide_observation_table_writer/csv_wide_observation_table_writer.py
ods_wide_observation_table_writer
¶Provide an ODS writer.
ODSWideObservationTableWriter
¶
Bases: WideObservationTableWriter
Define the ODS writer.
Source code in src/taxpasta/infrastructure/application/wide_observation_table_writer/ods_wide_observation_table_writer.py
parquet_wide_observation_table_writer
¶Provide a parquet writer.
ParquetWideObservationTableWriter
¶
Bases: WideObservationTableWriter
Define the parquet writer.
Source code in src/taxpasta/infrastructure/application/wide_observation_table_writer/parquet_wide_observation_table_writer.py
tsv_wide_observation_table_writer
¶Provide a TSV writer.
TSVWideObservationTableWriter
¶
Bases: WideObservationTableWriter
Define the TSV writer.
Source code in src/taxpasta/infrastructure/application/wide_observation_table_writer/tsv_wide_observation_table_writer.py
xlsx_wide_observation_table_writer
¶Provide an XLSX writer.
XLSXWideObservationTableWriter
¶
Bases: WideObservationTableWriter
Define the XLSX writer.
Source code in src/taxpasta/infrastructure/application/wide_observation_table_writer/xlsx_wide_observation_table_writer.py
cli
¶
Attributes¶
Modules¶
merge
¶
Add the merge command to the taxpasta CLI.
Attributes¶
logger = logging.getLogger(__name__)
module-attribute
¶
Classes¶
Functions¶
merge(profiles: Optional[List[Path]] = typer.Argument(None, metavar='[PROFILE1 PROFILE2 [...]]', help='Two or more files containing taxonomic profiles. Required unless there is a sample sheet. Filenames will be parsed as sample names.', show_default=False), profiler: SupportedProfiler = typer.Option(..., '--profiler', '-p', case_sensitive=False, help='The taxonomic profiler used. All provided profiles must come from the same tool!', show_default=False), sample_sheet: Optional[Path] = typer.Option(None, '--samplesheet', '-s', help="A table with a header and two columns: the first column named 'sample' which can be any string and the second column named 'profile' which must be a file path to an actual taxonomic abundance profile. If this option is provided, any arguments are ignored.", exists=True, file_okay=True, dir_okay=False, readable=True), samplesheet_format: Optional[TableReaderFileFormat] = typer.Option(None, case_sensitive=False, help='The file format of the sample sheet. Depending on the choice, additional package dependencies may apply. Will be parsed from the sample sheet file name but can be set explicitly.'), output: Path = typer.Option(..., '--output', '-o', help='The desired output file. By default, the file extension will be used to determine the output format, but when setting the format explicitly using the --output-format option, automatic detection is disabled.', show_default=False), output_format: Optional[WideObservationTableFileFormat] = typer.Option(None, case_sensitive=False, help='The desired output format. Depending on the choice, additional package dependencies may apply. By default it will be parsed from the output file name but it can be set explicitly and will then disable the automatic detection.'), wide_format: bool = typer.Option(True, '--wide/--long', help='Output merged abundance data in either wide or (tidy) long format. 
Ignored when the desired output format is BIOM.'), summarise_at: Optional[str] = typer.Option(None, '--summarise-at', '--summarize-at', help="Summarise abundance profiles at higher taxonomic rank. The provided option must match a rank in the taxonomy exactly. This is akin to the clade assigned reads provided by, for example, kraken2, where the abundances of a whole taxonomic branch are assigned to a taxon at the desired rank. Please note that abundances above the selected rank are simply ignored. No attempt is made to redistribute those down to the desired rank. Some tools, like Bracken, were designed for this purpose but it doesn't seem like a problem we can generally solve here."), taxonomy: Optional[Path] = typer.Option(None, help='The path to a directory containing taxdump files. At least nodes.dmp and names.dmp are required. A merged.dmp file is optional.'), add_name: bool = typer.Option(False, '--add-name', help='Add the taxon name to the output.'), add_rank: bool = typer.Option(False, '--add-rank', help='Add the taxon rank to the output.'), add_lineage: bool = typer.Option(False, '--add-lineage', help="Add the taxon's entire lineage to the output. These are taxon names separated by semi-colons."), add_id_lineage: bool = typer.Option(False, '--add-id-lineage', help="Add the taxon's entire lineage to the output. These are taxon identifiers separated by semi-colons."), add_rank_lineage: bool = typer.Option(False, '--add-rank-lineage', help="Add the taxon's entire rank lineage to the output. These are taxon ranks separated by semi-colons."), ignore_errors: bool = typer.Option(False, '--ignore-errors', help='Ignore any metagenomic profiles with errors. Please note that there must be at least two profiles without errors to merge.')) -> None
¶Standardise and merge two or more taxonomic profiles.
Source code in src/taxpasta/infrastructure/cli/merge.py
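As a rough illustration of how the options above combine, the following sketch assembles a `taxpasta merge` command line in Python without executing it. The flags are taken from the signature above; the file names are placeholders, and actually running the command would require taxpasta to be installed.

```python
import shlex
from typing import List

# Placeholder file names; any two or more profiles from the same profiler.
profiles = ["sample1.kreport.txt", "sample2.kreport.txt"]

command: List[str] = [
    "taxpasta",
    "merge",
    "--profiler", "kraken2",
    "--output", "merged.tsv",  # format is detected from the file extension
    "--long",  # tidy long format instead of the default wide format
    *profiles,
]

# Render the command as a single shell-safe string, e.g. for logging.
rendered = shlex.join(command)
```

The list form can be passed to `subprocess.run` as is, which avoids shell quoting problems with unusual file names.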
read_sample_sheet(sample_sheet: Path, sample_format: TableReaderFileFormat) -> DataFrame[SampleSheet]
¶Extract and validate the sample sheet.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample_sheet | Path | Path to the sample sheet. | required |
sample_format | TableReaderFileFormat | The determined file format. | required |
Returns:
Type | Description |
---|---|
DataFrame[SampleSheet] | A pandas data frame in the form of a sample sheet. |
Raises:
Type | Description |
---|---|
Exit | Early termination of the program when there is a schema error. |
Source code in src/taxpasta/infrastructure/cli/merge.py
validate_observation_matrix_format(output: Path, output_format: Optional[str]) -> WideObservationTableFileFormat
¶Detect the output format if it isn't given.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output | Path | Path for the output. | required |
output_format | Optional[str] | The selected file format, if any. | required |
Returns:
Type | Description |
---|---|
WideObservationTableFileFormat | The validated output file format. |
Raises:
Type | Description |
---|---|
Exit | Early termination of the program when the format cannot be guessed or dependencies are missing. |
Source code in src/taxpasta/infrastructure/cli/merge.py
validate_sample_format(sample_sheet: Path, sample_format: Optional[TableReaderFileFormat]) -> TableReaderFileFormat
¶Detect the sample sheet format if it isn't given.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample_sheet | Path | Path to the sample sheet. | required |
sample_format | Optional[TableReaderFileFormat] | The selected file format, if any. | required |
Returns:
Type | Description |
---|---|
TableReaderFileFormat | The validated sample sheet format. |
Raises:
Type | Description |
---|---|
Exit | Early termination of the program when the format cannot be guessed or dependencies are missing. |
Source code in src/taxpasta/infrastructure/cli/merge.py
validate_tidy_observation_table_format(output: Path, output_format: Optional[str]) -> TidyObservationTableFileFormat
¶Detect the output format if it isn't given.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output | Path | Path for the output. | required |
output_format | Optional[str] | The selected file format, if any. | required |
Returns:
Type | Description |
---|---|
TidyObservationTableFileFormat | The validated output file format. |
Raises:
Type | Description |
---|---|
Exit | Early termination of the program when the format cannot be guessed or dependencies are missing. |
Source code in src/taxpasta/infrastructure/cli/merge.py
standardise
¶
Add the standardize command to the taxpasta CLI.
Attributes¶
logger = logging.getLogger(__name__)
module-attribute
¶
Classes¶
Functions¶
standardise(profile: Path = typer.Argument(..., metavar='PROFILE', help='A file containing a taxonomic profile.', show_default=False), profiler: SupportedProfiler = typer.Option(..., '--profiler', '-p', case_sensitive=False, help='The taxonomic profiler used.', show_default=False), output: Path = typer.Option(..., '--output', '-o', help='The desired output file. By default, the file extension will be used to determine the output format, but when setting the format explicitly using the --output-format option, automatic detection is disabled.', show_default=False), output_format: Optional[StandardProfileFileFormat] = typer.Option(None, case_sensitive=False, help='The desired output format. Depending on the choice, additional package dependencies may apply. By default it will be parsed from the output file name but it can be set explicitly and will then disable the automatic detection.'), summarise_at: Optional[str] = typer.Option(None, '--summarise-at', '--summarize-at', help="Summarise abundance profiles at higher taxonomic rank. The provided option must match a rank in the taxonomy exactly. This is akin to the clade assigned reads provided by, for example, kraken2, where the abundances of a whole taxonomic branch are assigned to a taxon at the desired rank. Please note that abundances above the selected rank are simply ignored. No attempt is made to redistribute those down to the desired rank. Some tools, like Bracken, were designed for this purpose but it doesn't seem like a problem we can generally solve here."), taxonomy: Optional[Path] = typer.Option(None, help='The path to a directory containing taxdump files. At least nodes.dmp and names.dmp are required. 
A merged.dmp file is optional.'), add_name: bool = typer.Option(False, '--add-name', help='Add the taxon name to the output.'), add_rank: bool = typer.Option(False, '--add-rank', help='Add the taxon rank to the output.'), add_lineage: bool = typer.Option(False, '--add-lineage', help="Add the taxon's entire lineage to the output. These are taxon names separated by semi-colons."), add_id_lineage: bool = typer.Option(False, '--add-id-lineage', help="Add the taxon's entire lineage to the output. These are taxon identifiers separated by semi-colons."), add_rank_lineage: bool = typer.Option(False, '--add-rank-lineage', help="Add the taxon's entire rank lineage to the output. These are taxon ranks separated by semi-colons.")) -> None
¶Standardise a taxonomic profile.
Source code in src/taxpasta/infrastructure/cli/standardise.py
validate_output_format(output: Path, output_format: Optional[str]) -> StandardProfileFileFormat
¶Detect the output format if it isn't given.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output | Path | Path for the output. | required |
output_format | Optional[str] | The selected file format, if any. | required |
Returns:
Type | Description |
---|---|
StandardProfileFileFormat | The validated output file format. |
Raises:
Type | Description |
---|---|
Exit | Early termination of the program when the format cannot be guessed or dependencies are missing. |
Source code in src/taxpasta/infrastructure/cli/standardise.py
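The detection described above (use the explicit format when given, otherwise fall back to the file extension) might look roughly like the following sketch. The `FileFormat` enumeration and the `guess_format` helper are hypothetical stand-ins, the case-insensitive matching rule is an assumption, and a `ValueError` replaces the `Exit` that the documented function raises.

```python
from enum import Enum
from pathlib import Path
from typing import Optional


class FileFormat(str, Enum):
    """Stand-in enumeration with a subset of the documented formats."""

    TSV = "TSV"
    CSV = "CSV"
    parquet = "parquet"


def guess_format(output: Path, output_format: Optional[str] = None) -> FileFormat:
    """Return the explicit format if given, else guess it from the suffix."""
    # An explicit choice disables detection; otherwise use the file extension.
    if output_format is not None:
        candidate = output_format
    else:
        candidate = output.suffix.lstrip(".")
    for member in FileFormat:
        if member.value.lower() == candidate.lower():
            return member
    raise ValueError(f"Unrecognized output format: {candidate!r}")


detected = guess_format(Path("profile.tsv"))
explicit = guess_format(Path("profile.txt"), "parquet")
```

In the second call the `.txt` extension is never inspected, because the explicit format takes precedence.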
taxpasta
¶
Provide a command-line interface (CLI) for taxpasta functionality.
Attributes¶
app = typer.Typer(help='TAXonomic Profile Aggregation and STAndardisation', context_settings={'help_option_names': ['-h', '--help']})
module-attribute
¶
logger = logging.getLogger('taxpasta')
module-attribute
¶
Classes¶
LogLevel
¶
Bases: str, Enum
Define the choices for the log level option.
Source code in src/taxpasta/infrastructure/cli/taxpasta.py
Functions¶
initialize(context: typer.Context, version: Optional[bool] = typer.Option(None, '--version', callback=version_callback, is_eager=True, help='Print only the current tool version and exit.'), log_level: LogLevel = typer.Option(LogLevel.INFO.name, '--log-level', '-l', case_sensitive=False, help='Set the desired log level.'))
¶Initialize logging and rich printing if available.
Source code in src/taxpasta/infrastructure/cli/taxpasta.py
version_callback(is_set: bool) -> None
¶Print the tool version if desired.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
is_set | bool | Whether the version was requested as a command line option. | required |
Raises:
Type | Description |
---|---|
Exit | With default code 0 to signal normal program end. |
Source code in src/taxpasta/infrastructure/cli/taxpasta.py
Modules¶
domain
¶
Provide concrete implementations of domain models and services.
Modules¶
service
¶
Provide concrete implementations of domain services.
Modules¶
taxopy_taxonomy_service
¶Provide a taxonomy service based on taxopy.
logger = logging.getLogger(__name__)
module-attribute
¶
TaxopyTaxonomyService
¶
Bases: TaxonomyService
Define the taxonomy service based on taxopy.
Source code in src/taxpasta/infrastructure/domain/service/taxopy_taxonomy_service.py
__init__(*, tax_db: taxopy.TaxDb, **kwargs) -> None
¶Initialize a taxonomy service instance with a taxopy database.
add_identifier_lineage(table: DataFrame[ResultTable]) -> DataFrame[ResultTable]
¶Add a column for the taxon lineage as identifiers to the given table.
Source code in src/taxpasta/infrastructure/domain/service/taxopy_taxonomy_service.py
add_name(table: DataFrame[ResultTable]) -> DataFrame[ResultTable]
¶Add a column for the taxon name to the given table.
Source code in src/taxpasta/infrastructure/domain/service/taxopy_taxonomy_service.py
add_name_lineage(table: DataFrame[ResultTable]) -> DataFrame[ResultTable]
¶Add a column for the taxon lineage to the given table.
Source code in src/taxpasta/infrastructure/domain/service/taxopy_taxonomy_service.py
add_rank(table: DataFrame[ResultTable]) -> DataFrame[ResultTable]
¶Add a column for the taxon rank to the given table.
Source code in src/taxpasta/infrastructure/domain/service/taxopy_taxonomy_service.py
add_rank_lineage(table: DataFrame[ResultTable]) -> DataFrame[ResultTable]
¶Add a column for the taxon lineage as ranks to the given table.
Source code in src/taxpasta/infrastructure/domain/service/taxopy_taxonomy_service.py
format_biom_taxonomy(table: DataFrame[ResultTable]) -> List[Dict[str, List[str]]]
¶Format the taxonomy as BIOM observation metadata.
Source code in src/taxpasta/infrastructure/domain/service/taxopy_taxonomy_service.py
from_taxdump(source: Path) -> TaxopyTaxonomyService
classmethod
¶Create a service instance from a directory path containing taxdump info.
Source code in src/taxpasta/infrastructure/domain/service/taxopy_taxonomy_service.py
get_taxon_identifier_lineage(taxonomy_id: int) -> Optional[List[int]]
¶Return the lineage of a given taxonomy identifier as identifiers.
Only identifiers with associated ranks are included.
Source code in src/taxpasta/infrastructure/domain/service/taxopy_taxonomy_service.py
get_taxon_name(taxonomy_id: int) -> Optional[str]
¶Return the name of a given taxonomy identifier.
get_taxon_name_lineage(taxonomy_id: int) -> Optional[List[str]]
¶Return the lineage of a given taxonomy identifier as names.
Only names with associated ranks are included.
Source code in src/taxpasta/infrastructure/domain/service/taxopy_taxonomy_service.py
get_taxon_rank(taxonomy_id: int) -> Optional[str]
¶Return the rank of a given taxonomy identifier.
get_taxon_rank_lineage(taxonomy_id: int) -> Optional[List[str]]
¶Return the lineage of a given taxonomy identifier as ranks.
Source code in src/taxpasta/infrastructure/domain/service/taxopy_taxonomy_service.py
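The lineage methods above can be pictured as a walk from a taxon to the root that keeps only ranked nodes. The following sketch uses a small toy taxonomy held in a dictionary rather than a taxopy database. The `Node` type, the `get_name_lineage` function, the "no rank" marker, and the root-to-taxon ordering are all assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple, Optional


class Node(NamedTuple):
    parent: int
    rank: str
    name: str


# Toy taxonomy for illustration only; identifiers and names are made up.
TAXONOMY: Dict[int, Node] = {
    1: Node(1, "no rank", "root"),
    2: Node(1, "superkingdom", "Bacteria"),
    10: Node(2, "no rank", "unranked clade"),
    20: Node(10, "genus", "Examplea"),
    30: Node(20, "species", "Examplea demo"),
}


def get_name_lineage(taxonomy_id: int) -> Optional[List[str]]:
    """Return ranked names from the root down to the taxon, or None if unknown."""
    if taxonomy_id not in TAXONOMY:
        return None
    lineage: List[str] = []
    current = taxonomy_id
    while True:
        node = TAXONOMY[current]
        if node.rank != "no rank":  # skip nodes without an associated rank
            lineage.append(node.name)
        if node.parent == current:  # the root is its own parent
            break
        current = node.parent
    return lineage[::-1]


lineage = get_name_lineage(30)
# Join with semicolons, as the --add-lineage help text describes.
joined = ";".join(lineage) if lineage is not None else None
```

Returning `None` for an unknown identifier matches the `Optional` return types documented above.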
summarise_at(profile: DataFrame[StandardProfile], rank: str) -> DataFrame[StandardProfile]
¶Summarise a standardised abundance profile at a higher taxonomic rank.
Source code in src/taxpasta/infrastructure/domain/service/taxopy_taxonomy_service.py
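The `--summarise-at` help text describes the intended behaviour: counts from a whole branch are assigned to the ancestor at the requested rank, and counts sitting above that rank are ignored. The sketch below reproduces that idea on a toy taxonomy with plain dictionaries. The data, the `ancestor_at_rank` helper, and the dictionary-based profile are assumptions and not the pandas-based implementation.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple

# Toy taxonomy: taxonomy_id -> (parent_id, rank). Values are made up.
TAXONOMY: Dict[int, Tuple[int, str]] = {
    1: (1, "no rank"),
    2: (1, "genus"),
    3: (2, "species"),
    4: (2, "species"),
    5: (1, "genus"),
    6: (5, "species"),
}


def ancestor_at_rank(taxonomy_id: int, rank: str) -> Optional[int]:
    """Walk towards the root and return the first taxon with the given rank."""
    current = taxonomy_id
    while True:
        parent, current_rank = TAXONOMY[current]
        if current_rank == rank:
            return current
        if parent == current:  # reached the root without finding the rank
            return None
        current = parent


def summarise_at(profile: Dict[int, int], rank: str) -> Dict[int, int]:
    """Sum counts of each branch onto its ancestor at the requested rank."""
    summary: Dict[int, int] = defaultdict(int)
    for taxonomy_id, count in profile.items():
        target = ancestor_at_rank(taxonomy_id, rank)
        if target is not None:  # counts above the rank are ignored
            summary[target] += count
    return dict(summary)


profile = {1: 7, 2: 10, 3: 5, 4: 3, 6: 4}
genus_counts = summarise_at(profile, "genus")
```

The seven counts assigned to the root are dropped rather than redistributed, in line with the note that no attempt is made to push abundances down to the desired rank.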
helpers
¶
Provide general helpers.
Classes¶
Functions¶
Modules¶
base_data_frame_model
¶
Provide a base data frame model for general checks and configuration.
Classes¶
BaseDataFrameModel
¶
Bases: DataFrameModel
Define the base data frame model for general checks and configuration.
Source code in src/taxpasta/infrastructure/helpers/base_data_frame_model.py
decorators
¶
Provide general decorators.
Functions¶
raise_parser_warnings(func: Callable) -> Callable
¶Decorate a function in order to raise parser warnings as value errors.
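One common way to build such a decorator is to promote warnings to exceptions for the duration of the call and re-raise them as `ValueError`. The sketch below does this with the standard `warnings` module. It treats every warning category alike, which is an assumption (the documented decorator targets parser warnings), and the names `raise_warnings_as_value_errors` and `parse` are hypothetical.

```python
import functools
import warnings
from typing import Any, Callable


def raise_warnings_as_value_errors(func: Callable) -> Callable:
    """Re-raise any warning emitted by the wrapped function as a ValueError."""

    @functools.wraps(func)
    def wrapped(*args: Any, **kwargs: Any) -> Any:
        with warnings.catch_warnings():
            # Turn warnings into exceptions for the duration of the call.
            warnings.simplefilter("error")
            try:
                return func(*args, **kwargs)
            except Warning as error:
                raise ValueError(str(error)) from error

    return wrapped


@raise_warnings_as_value_errors
def parse(text: str) -> int:
    """Hypothetical parser that warns about surrounding whitespace."""
    if text != text.strip():
        warnings.warn("Input contains surrounding whitespace.", UserWarning)
    return int(text)
```

Calling `parse("42")` returns the integer, while `parse(" 42 ")` raises a `ValueError` carrying the warning message. The warning filters are restored when the `with` block exits.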