Modules¶
application
¶
Classes¶
Modules¶
application_service_registry
¶
Provide an application service registry.
Classes¶
ApplicationServiceRegistry
¶Define an application service registry.
Source code in taxpasta/infrastructure/application/application_service_registry.py
profile_reader(profiler: SupportedProfiler) -> Type[ProfileReader]
classmethod
¶Return a profile reader of the correct type.
Source code in taxpasta/infrastructure/application/application_service_registry.py
profile_standardisation_service(profiler: SupportedProfiler) -> Type[ProfileStandardisationService]
classmethod
¶Return a profile standardisation service of the correct type.
Source code in taxpasta/infrastructure/application/application_service_registry.py
standard_profile_writer(file_format: StandardProfileFileFormat) -> Type[StandardProfileWriter]
classmethod
¶Return a standard profile writer of the correct type.
Source code in taxpasta/infrastructure/application/application_service_registry.py
table_reader(file_format: TableReaderFileFormat) -> Type[TableReader]
classmethod
¶Return a table reader of the correct type.
Source code in taxpasta/infrastructure/application/application_service_registry.py
tidy_observation_table_writer(file_format: TidyObservationTableFileFormat) -> Type[TidyObservationTableWriter]
classmethod
¶Return a tidy table writer of the correct type.
Source code in taxpasta/infrastructure/application/application_service_registry.py
wide_observation_table_writer(file_format: WideObservationTableFileFormat) -> Type[WideObservationTableWriter]
classmethod
¶Return a writer for wide observation tables in the specified format.
Source code in taxpasta/infrastructure/application/application_service_registry.py
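Each registry method above takes an enumeration member and returns a class, not an instance. The following is a minimal sketch of one way such a lookup could be organised. `Profiler`, `Reader`, `BrackenReader`, `Kraken2Reader`, and `Registry` are hypothetical stand-ins, and the dictionary-based dispatch is an assumption rather than the way taxpasta implements it.

```python
from enum import Enum
from typing import Dict, Type


class Profiler(str, Enum):
    """Hypothetical stand-in for SupportedProfiler."""

    bracken = "bracken"
    kraken2 = "kraken2"


class Reader:
    """Hypothetical stand-in for the ProfileReader base class."""


class BrackenReader(Reader):
    """Hypothetical reader for Bracken profiles."""


class Kraken2Reader(Reader):
    """Hypothetical reader for kraken2 profiles."""


class Registry:
    """Map enumeration members to concrete reader classes."""

    # Assumption: a plain dictionary; the real registry may dispatch differently.
    _readers: Dict[Profiler, Type[Reader]] = {
        Profiler.bracken: BrackenReader,
        Profiler.kraken2: Kraken2Reader,
    }

    @classmethod
    def profile_reader(cls, profiler: Profiler) -> Type[Reader]:
        """Return the reader class registered for the given profiler."""
        try:
            return cls._readers[profiler]
        except KeyError:
            raise ValueError(f"Unexpected profiler: {profiler!r}") from None


# The caller receives a class and can use its classmethods directly.
reader_type = Registry.profile_reader(Profiler("kraken2"))
```

Returning the class fits the rest of this API, where `read`, `transform`, and `write` are all classmethods, so no instance needs to be created.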
bracken
¶
Classes¶
Modules¶
bracken_profile
¶Provide a description of the Bracken profile format.
BrackenProfile
¶
Bases: pa.SchemaModel
Define the expected Bracken profile format.
Source code in taxpasta/infrastructure/application/bracken/bracken_profile.py
added_reads: Series[int] = pa.Field(ge=0)
class-attribute
¶fraction_total_reads: Series[float] = pa.Field(ge=0.0, le=1.0)
class-attribute
¶kraken_assigned_reads: Series[int] = pa.Field(ge=0)
class-attribute
¶name: Series[str] = pa.Field()
class-attribute
¶new_est_reads: Series[int] = pa.Field(ge=0)
class-attribute
¶taxonomy_id: Series[int] = pa.Field(ge=0)
class-attribute
¶taxonomy_lvl: Series[pd.CategoricalDtype] = pa.Field()
class-attribute
¶Config
¶Configure the schema model.
Source code in taxpasta/infrastructure/application/bracken/bracken_profile.py
check_added_reads_consistency(profile: DataFrame) -> Series[bool]
classmethod
¶Check that Bracken added reads are consistent.
Source code in taxpasta/infrastructure/application/bracken/bracken_profile.py
check_compositionality(fraction_total_reads: Series[float]) -> bool
classmethod
¶Check that the fractions of reads add up to one.
Source code in taxpasta/infrastructure/application/bracken/bracken_profile.py
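The two Bracken checks above can be illustrated without pandera. This sketch assumes that "consistent" means `kraken_assigned_reads + added_reads == new_est_reads` for every row, and it uses a hypothetical absolute tolerance for the sum of fractions. Both the helper names and the tolerance are assumptions, not values taken from the source.

```python
import math
from typing import Dict, List, Union

Row = Dict[str, Union[int, float]]


def added_reads_consistent(rows: List[Row]) -> List[bool]:
    """Return one flag per row (assumed relation between the read columns)."""
    return [
        row["kraken_assigned_reads"] + row["added_reads"] == row["new_est_reads"]
        for row in rows
    ]


def fractions_sum_to_one(rows: List[Row], abs_tol: float = 0.01) -> bool:
    """Return whether the read fractions add up to one within a tolerance."""
    total = sum(row["fraction_total_reads"] for row in rows)
    # The tolerance is hypothetical; rounded fractions rarely sum to exactly 1.0.
    return math.isclose(total, 1.0, abs_tol=abs_tol)


rows: List[Row] = [
    {
        "kraken_assigned_reads": 90,
        "added_reads": 10,
        "new_est_reads": 100,
        "fraction_total_reads": 0.25,
    },
    {
        "kraken_assigned_reads": 250,
        "added_reads": 50,
        "new_est_reads": 300,
        "fraction_total_reads": 0.75,
    },
]
```

The return types mirror the signatures above: the consistency check yields one boolean per row, while the compositionality check yields a single boolean for the whole column.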
bracken_profile_reader
¶Provide a reader for Bracken profiles.
BrackenProfileReader
¶
Bases: ProfileReader
Define a reader for Bracken profiles.
Source code in taxpasta/infrastructure/application/bracken/bracken_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[BrackenProfile]
classmethod
¶Read a Bracken taxonomic profile from the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
BufferOrFilepath
|
A source that contains a tab-separated taxonomic profile generated by Bracken. |
required |
Returns:
Type | Description |
---|---|
DataFrame[BrackenProfile]
|
A data frame representation of the Bracken profile. |
Source code in taxpasta/infrastructure/application/bracken/bracken_profile_reader.py
bracken_profile_standardisation_service
¶Provide a standardisation service for Bracken profiles.
BrackenProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for Bracken profiles.
Source code in taxpasta/infrastructure/application/bracken/bracken_profile_standardisation_service.py
transform(profile: DataFrame[BrackenProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given Bracken profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
DataFrame[BrackenProfile]
|
A taxonomic profile generated by Bracken. |
required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile]
|
A standardized profile. |
Raises:
Type | Description |
---|---|
pandera.errors.SchemaErrors
|
If the given profile does not conform with the
|
Source code in taxpasta/infrastructure/application/bracken/bracken_profile_standardisation_service.py
centrifuge
¶
Classes¶
Modules¶
centrifuge_profile
¶Provide a description of the centrifuge profile format.
CentrifugeProfile
¶
Bases: pa.SchemaModel
Define the expected centrifuge profile format.
Source code in taxpasta/infrastructure/application/centrifuge/centrifuge_profile.py
clade_assigned_reads: Series[int] = pa.Field(ge=0)
class-attribute
¶direct_assigned_reads: Series[int] = pa.Field(ge=0)
class-attribute
¶name: Series[str] = pa.Field()
class-attribute
¶percent: Series[float] = pa.Field(ge=0.0, le=100.0)
class-attribute
¶taxonomy_id: Series[int] = pa.Field(ge=0)
class-attribute
¶taxonomy_level: Series[pd.CategoricalDtype] = pa.Field()
class-attribute
¶Config
¶Configure the schema model.
Source code in taxpasta/infrastructure/application/centrifuge/centrifuge_profile.py
check_compositionality(percent: Series[float]) -> bool
classmethod
¶Check that the percent of 'unclassified' and 'root' add up to a hundred.
Source code in taxpasta/infrastructure/application/centrifuge/centrifuge_profile.py
centrifuge_profile_reader
¶Provide a reader for Centrifuge profiles.
CentrifugeProfileReader
¶
Bases: ProfileReader
Define a reader for centrifuge profiles.
Source code in taxpasta/infrastructure/application/centrifuge/centrifuge_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[CentrifugeProfile]
classmethod
¶Read a centrifuge taxonomic profile from the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
BufferOrFilepath
|
A source that contains a tab-separated taxonomic profile generated by centrifuge. |
required |
Returns:
Type | Description |
---|---|
DataFrame[CentrifugeProfile]
|
A data frame representation of the centrifuge profile. |
Source code in taxpasta/infrastructure/application/centrifuge/centrifuge_profile_reader.py
centrifuge_profile_standardisation_service
¶Provide a standardisation service for centrifuge profiles.
logger = logging.getLogger(__name__)
module-attribute
¶CentrifugeProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for centrifuge profiles.
Source code in taxpasta/infrastructure/application/centrifuge/centrifuge_profile_standardisation_service.py
transform(profile: DataFrame[CentrifugeProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given centrifuge profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
DataFrame[CentrifugeProfile]
|
A taxonomic profile generated by centrifuge. |
required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile]
|
A standardized profile. |
Source code in taxpasta/infrastructure/application/centrifuge/centrifuge_profile_standardisation_service.py
diamond
¶
Classes¶
Modules¶
diamond_profile
¶Provide a description of the diamond profile format.
DiamondProfile
¶
Bases: pa.SchemaModel
Define the expected diamond profile format.
Source code in taxpasta/infrastructure/application/diamond/diamond_profile.py
e_value: Series[float] = pa.Field(ge=0.0, le=1.0)
class-attribute
¶query_id: Series[str] = pa.Field()
class-attribute
¶taxonomy_id: Series[int] = pa.Field(ge=0)
class-attribute
¶Config
¶Configure the schema model.
Source code in taxpasta/infrastructure/application/diamond/diamond_profile.py
diamond_profile_reader
¶Provide a reader for diamond profiles.
DiamondProfileReader
¶
Bases: ProfileReader
Define a reader for Diamond profiles.
Source code in taxpasta/infrastructure/application/diamond/diamond_profile_reader.py
LARGE_INTEGER = int(10000000.0)
class-attribute
¶read(profile: BufferOrFilepath) -> DataFrame[DiamondProfile]
classmethod
¶Read a diamond taxonomic profile from a file.
Source code in taxpasta/infrastructure/application/diamond/diamond_profile_reader.py
diamond_profile_standardisation_service
¶Provide a standardisation service for diamond profiles.
DiamondProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for diamond profiles.
Source code in taxpasta/infrastructure/application/diamond/diamond_profile_standardisation_service.py
transform(profile: DataFrame[DiamondProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given diamond profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
DataFrame[DiamondProfile]
|
A taxonomic profile generated by diamond. |
required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile]
|
A standardized profile. |
Source code in taxpasta/infrastructure/application/diamond/diamond_profile_standardisation_service.py
kaiju
¶
Classes¶
Modules¶
kaiju_profile
¶Provide a description of the kaiju profile format.
KaijuProfile
¶
Bases: pa.SchemaModel
Define the expected kaiju profile format.
Source code in taxpasta/infrastructure/application/kaiju/kaiju_profile.py
file: Series[str] = pa.Field()
class-attribute
¶percent: Series[float] = pa.Field(ge=0.0, le=100.0)
class-attribute
¶reads: Series[int] = pa.Field(ge=0)
class-attribute
¶taxon_id: Series[str] = pa.Field(nullable=True)
class-attribute
¶taxon_name: Series[str] = pa.Field()
class-attribute
¶Config
¶Configure the schema model.
Source code in taxpasta/infrastructure/application/kaiju/kaiju_profile.py
check_compositionality(percent: Series[float]) -> bool
classmethod
¶Check that the percentages add up to a hundred.
Source code in taxpasta/infrastructure/application/kaiju/kaiju_profile.py
check_unique_filename(file_col: Series[str]) -> bool
classmethod
¶Check that the Kaiju filename is unique.
Source code in taxpasta/infrastructure/application/kaiju/kaiju_profile.py
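As a rough illustration of the two Kaiju checks, the sketch below works on plain lists instead of pandera series. It assumes that a "unique" filename means every row carries the same value in the `file` column. The tolerance on the percentage sum is hypothetical.

```python
import math
from typing import List


def percentages_sum_to_hundred(percent: List[float], abs_tol: float = 1.0) -> bool:
    """Return whether the percentages add up to a hundred within a tolerance."""
    # The tolerance is an assumption made for this sketch.
    return math.isclose(sum(percent), 100.0, abs_tol=abs_tol)


def filename_is_unique(file_col: List[str]) -> bool:
    """Return whether exactly one distinct filename occurs in the column."""
    return len(set(file_col)) == 1


percent = [60.0, 30.0, 10.0]
files = ["sample1.out", "sample1.out", "sample1.out"]
```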
kaiju_profile_reader
¶Provide a reader for kaiju profiles.
KaijuProfileReader
¶
Bases: ProfileReader
Define a reader for kaiju profiles.
Source code in taxpasta/infrastructure/application/kaiju/kaiju_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[KaijuProfile]
classmethod
¶Read a kaiju taxonomic profile from the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
BufferOrFilepath
|
A source that contains a tab-separated taxonomic profile generated by kaiju. |
required |
Returns:
Type | Description |
---|---|
DataFrame[KaijuProfile]
|
A data frame representation of the kaiju profile. |
Source code in taxpasta/infrastructure/application/kaiju/kaiju_profile_reader.py
kaiju_profile_standardisation_service
¶Provide a standardisation service for kaiju profiles.
KaijuProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for kaiju profiles.
Source code in taxpasta/infrastructure/application/kaiju/kaiju_profile_standardisation_service.py
transform(profile: DataFrame[KaijuProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given kaiju profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
DataFrame[KaijuProfile]
|
A taxonomic profile generated by kaiju. |
required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile]
|
A standardized profile. |
Source code in taxpasta/infrastructure/application/kaiju/kaiju_profile_standardisation_service.py
kraken2
¶
Classes¶
Modules¶
kraken2_profile
¶Provide a description of the kraken2 profile format.
Kraken2Profile
¶
Bases: pa.SchemaModel
Define the expected kraken2 profile format.
Source code in taxpasta/infrastructure/application/kraken2/kraken2_profile.py
clade_assigned_reads: Series[int] = pa.Field(ge=0)
class-attribute
¶direct_assigned_reads: Series[int] = pa.Field(ge=0)
class-attribute
¶distinct_minimizers: Optional[Series[int]] = pa.Field(ge=0)
class-attribute
¶name: Series[str] = pa.Field()
class-attribute
¶num_minimizers: Optional[Series[int]] = pa.Field(ge=0)
class-attribute
¶percent: Series[float] = pa.Field(ge=0.0, le=100.0)
class-attribute
¶taxonomy_id: Series[int] = pa.Field(ge=0)
class-attribute
¶taxonomy_lvl: Series[pd.CategoricalDtype] = pa.Field()
class-attribute
¶Config
¶Configure the schema model.
Source code in taxpasta/infrastructure/application/kraken2/kraken2_profile.py
check_compositionality(profile: pd.DataFrame) -> bool
classmethod
¶Check that the percent of 'unclassified' and 'root' add up to a hundred.
Source code in taxpasta/infrastructure/application/kraken2/kraken2_profile.py
kraken2_profile_reader
¶Provide a reader for kraken2 profiles.
Kraken2ProfileReader
¶
Bases: ProfileReader
Define a reader for kraken2 profiles.
Source code in taxpasta/infrastructure/application/kraken2/kraken2_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[Kraken2Profile]
classmethod
¶Read a kraken2 taxonomic profile from the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
BufferOrFilepath
|
A source that contains a tab-separated taxonomic profile generated by kraken2. |
required |
Returns:
Type | Description |
---|---|
DataFrame[Kraken2Profile]
|
A data frame representation of the kraken2 profile. |
Raises:
Type | Description |
---|---|
ValueError
|
In case the table does not contain exactly six or eight columns. |
Source code in taxpasta/infrastructure/application/kraken2/kraken2_profile_reader.py
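The `ValueError` described above depends on the width of the table. A minimal sketch of that kind of column-count dispatch follows, using only the standard library. The function `read_report` is hypothetical, and the column order is an assumption based on the field names of `Kraken2Profile`, with the two minimizer columns placed after the read counts.

```python
import csv
import io
from typing import Dict, List, TextIO

# Assumed column order for the six-column layout.
SIX = [
    "percent",
    "clade_assigned_reads",
    "direct_assigned_reads",
    "taxonomy_lvl",
    "taxonomy_id",
    "name",
]
# Assumed eight-column layout with the optional minimizer columns.
EIGHT = SIX[:3] + ["num_minimizers", "distinct_minimizers"] + SIX[3:]


def read_report(handle: TextIO) -> List[Dict[str, str]]:
    """Parse a headerless tab-separated report into a list of records."""
    rows = [row for row in csv.reader(handle, delimiter="\t") if row]
    if not rows:
        return []
    width = len(rows[0])
    if width == len(SIX):
        columns = SIX
    elif width == len(EIGHT):
        columns = EIGHT
    else:
        raise ValueError(f"Expected six or eight columns, found {width}.")
    return [dict(zip(columns, row)) for row in rows]


six_column = read_report(io.StringIO("100.00\t10\t0\tR\t1\troot\n"))
eight_column = read_report(io.StringIO("100.00\t10\t0\t5\t4\tR\t1\troot\n"))
```

Values stay as strings here; converting and validating them is left to the schema in the real reader.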
kraken2_profile_standardisation_service
¶Provide a standardisation service for kraken2 profiles.
Kraken2ProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for kraken2 profiles.
Source code in taxpasta/infrastructure/application/kraken2/kraken2_profile_standardisation_service.py
transform(profile: DataFrame[Kraken2Profile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given kraken2 profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
DataFrame[Kraken2Profile]
|
A taxonomic profile generated by kraken2. |
required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile]
|
A standardized profile. |
Source code in taxpasta/infrastructure/application/kraken2/kraken2_profile_standardisation_service.py
krakenuniq
¶
Classes¶
Modules¶
krakenuniq_profile
¶Provide a description of the KrakenUniq profile format.
KrakenUniqProfile
¶
Bases: pa.SchemaModel
Define the expected KrakenUniq profile format.
Source code in taxpasta/infrastructure/application/krakenuniq/krakenuniq_profile.py
coverage: Series[float] = pa.Field(ge=0.0, alias='cov')
class-attribute
¶duplicates: Series[float] = pa.Field(ge=0.0, alias='dup')
class-attribute
¶kmers: Series[int] = pa.Field(ge=0)
class-attribute
¶percent: Series[float] = pa.Field(ge=0.0, le=100.0, alias='%')
class-attribute
¶rank: Series[pd.CategoricalDtype] = pa.Field()
class-attribute
¶reads: Series[int] = pa.Field(ge=0)
class-attribute
¶tax_id: Series[int] = pa.Field(alias='taxID', ge=0)
class-attribute
¶tax_name: Series[str] = pa.Field(alias='taxName')
class-attribute
¶tax_reads: Series[int] = pa.Field(ge=0, alias='taxReads')
class-attribute
¶Config
¶Configure the schema model.
Source code in taxpasta/infrastructure/application/krakenuniq/krakenuniq_profile.py
krakenuniq_profile_reader
¶Provide a reader for KrakenUniq profiles.
KrakenUniqProfileReader
¶
Bases: ProfileReader
Define a reader for KrakenUniq profiles.
Source code in taxpasta/infrastructure/application/krakenuniq/krakenuniq_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[KrakenUniqProfile]
classmethod
¶Read a KrakenUniq taxonomic profile from the given source.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
BufferOrFilepath
|
A source that contains a tab-separated taxonomic profile generated by KrakenUniq. |
required |
Returns:
Type | Description |
---|---|
DataFrame[KrakenUniqProfile]
|
A data frame representation of the KrakenUniq profile. |
Source code in taxpasta/infrastructure/application/krakenuniq/krakenuniq_profile_reader.py
krakenuniq_profile_standardisation_service
¶Provide a standardisation service for KrakenUniq profiles.
KrakenUniqProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for KrakenUniq profiles.
Source code in taxpasta/infrastructure/application/krakenuniq/krakenuniq_profile_standardisation_service.py
transform(profile: DataFrame[KrakenUniqProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given KrakenUniq profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
DataFrame[KrakenUniqProfile]
|
A taxonomic profile generated by KrakenUniq. |
required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile]
|
A standardized profile. |
Source code in taxpasta/infrastructure/application/krakenuniq/krakenuniq_profile_standardisation_service.py
megan6
¶
Classes¶
Modules¶
megan6_profile
¶Provide a description of the MEGAN6 rma2info profile format.
Megan6Profile
¶
Bases: pa.SchemaModel
Define the expected MEGAN6 rma2info profile format.
Source code in taxpasta/infrastructure/application/megan6/megan6_profile.py
megan6_profile_reader
¶Provide a reader for megan6 profiles.
Megan6ProfileReader
¶
Bases: ProfileReader
Define a reader for MEGAN6 rma2info profiles.
Source code in taxpasta/infrastructure/application/megan6/megan6_profile_reader.py
LARGE_INTEGER = int(10000000.0)
class-attribute
¶read(profile: BufferOrFilepath) -> DataFrame[Megan6Profile]
classmethod
¶Read a MEGAN6 rma2info taxonomic profile from a file.
Source code in taxpasta/infrastructure/application/megan6/megan6_profile_reader.py
megan6_profile_standardisation_service
¶Provide a standardisation service for megan6 profiles.
Megan6ProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for megan6 profiles.
Source code in taxpasta/infrastructure/application/megan6/megan6_profile_standardisation_service.py
transform(profile: DataFrame[Megan6Profile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given MEGAN6 rma2info profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
DataFrame[Megan6Profile]
|
A taxonomic profile generated by MEGAN6 rma2info. |
required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile]
|
A standardized profile. |
Source code in taxpasta/infrastructure/application/megan6/megan6_profile_standardisation_service.py
metaphlan
¶
Classes¶
Modules¶
metaphlan_profile
¶Provide a description of the metaphlan profile format.
MetaphlanProfile
¶
Bases: pa.SchemaModel
Define the expected metaphlan profile format.
Source code in taxpasta/infrastructure/application/metaphlan/metaphlan_profile.py
additional_species: Optional[Series[str]] = pa.Field(nullable=True)
class-attribute
¶clade_name: Series[str] = pa.Field()
class-attribute
¶ncbi_tax_id: Series[str] = pa.Field(alias='NCBI_tax_id')
class-attribute
¶relative_abundance: Series[float] = pa.Field(ge=0.0, le=100.0)
class-attribute
¶Config
¶Configure the schema model.
Source code in taxpasta/infrastructure/application/metaphlan/metaphlan_profile.py
check_compositionality(profile: pd.DataFrame) -> bool
classmethod
¶Check that the percentages per rank add up to a hundred.
Source code in taxpasta/infrastructure/application/metaphlan/metaphlan_profile.py
metaphlan_profile_reader
¶Provide a reader for metaphlan profiles.
MetaphlanProfileReader
¶
Bases: ProfileReader
Define a reader for Metaphlan profiles.
Source code in taxpasta/infrastructure/application/metaphlan/metaphlan_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[MetaphlanProfile]
classmethod
¶Read a metaphlan taxonomic profile from a file.
Source code in taxpasta/infrastructure/application/metaphlan/metaphlan_profile_reader.py
metaphlan_profile_standardisation_service
¶Provide a standardisation service for metaphlan profiles.
MetaphlanProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for metaphlan profiles.
Source code in taxpasta/infrastructure/application/metaphlan/metaphlan_profile_standardisation_service.py
LARGE_INTEGER = int(1000000.0)
class-attribute
¶transform(profile: DataFrame[MetaphlanProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given metaphlan profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
DataFrame[MetaphlanProfile]
|
A taxonomic profile generated by metaphlan. |
required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile]
|
A standardized profile. |
Source code in taxpasta/infrastructure/application/metaphlan/metaphlan_profile_standardisation_service.py
motus
¶
Classes¶
Modules¶
motus_profile
¶Provide a description of the mOTUs profile format.
MotusProfile
¶
Bases: pa.SchemaModel
Define the expected mOTUs profile format.
Source code in taxpasta/infrastructure/application/motus/motus_profile.py
consensus_taxonomy: Series[str] = pa.Field()
class-attribute
¶ncbi_tax_id: Series[str] = pa.Field(nullable=True)
class-attribute
¶read_count: Series[int] = pa.Field(ge=0)
class-attribute
¶Config
¶Configure the schema model.
Source code in taxpasta/infrastructure/application/motus/motus_profile.py
motus_profile_reader
¶Provide a reader for motus profiles.
MotusProfileReader
¶
Bases: ProfileReader
Define a reader for mOTUs profiles.
Source code in taxpasta/infrastructure/application/motus/motus_profile_reader.py
read(profile: BufferOrFilepath) -> DataFrame[MotusProfile]
classmethod
¶Read a mOTUs taxonomic profile from a file.
Source code in taxpasta/infrastructure/application/motus/motus_profile_reader.py
motus_profile_standardisation_service
¶Provide a standardisation service for mOTUs profiles.
MotusProfileStandardisationService
¶
Bases: ProfileStandardisationService
Define a standardisation service for mOTUs profiles.
Source code in taxpasta/infrastructure/application/motus/motus_profile_standardisation_service.py
transform(profile: DataFrame[MotusProfile]) -> DataFrame[StandardProfile]
classmethod
¶Tidy up and standardize a given mOTUs profile.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
DataFrame[MotusProfile]
|
A taxonomic profile generated by mOTUs. |
required |
Returns:
Type | Description |
---|---|
DataFrame[StandardProfile]
|
A standardized profile. |
Source code in taxpasta/infrastructure/application/motus/motus_profile_standardisation_service.py
sample_etl_application
¶
Provide a sample ETL application.
Attributes¶
logger = logging.getLogger(__name__)
module-attribute
¶Classes¶
SampleETLApplication
¶Define the sample ETL application.
Source code in taxpasta/infrastructure/application/sample_etl_application.py
reader = profile_reader
instance-attribute
¶standardiser = profile_standardiser
instance-attribute
¶__init__(*, profile_reader: Type[ProfileReader], profile_standardiser: Type[ProfileStandardisationService], **kwargs: dict)
¶Initialize the application for a particular taxonomic profiler.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile_reader |
Type[ProfileReader]
|
A profile reader for a specific taxonomic profile format. |
required |
profile_standardiser |
Type[ProfileStandardisationService]
|
A profile standardisation service for a specific taxonomic profile format. |
required |
**kwargs |
dict
|
Passed on for inheritance. |
{}
|
Source code in taxpasta/infrastructure/application/sample_etl_application.py
etl(profile: Path, name: Optional[str] = None) -> Sample
¶Extract, transform, and load a profile into a sample.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
profile |
Path
|
A taxonomic profile. |
required |
name |
Optional[str]
|
An optional name for the sample. Otherwise, the profile's filename is used. |
None
|
Returns:
Type | Description |
---|---|
Sample
|
A sample. |
Raises:
Type | Description |
---|---|
StandardisationError
|
If the given profile does not match the validation schema. |
Source code in taxpasta/infrastructure/application/sample_etl_application.py
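The constructor and `etl` method described above follow a read, transform, wrap pattern with injected classes. The sketch below shows that flow under stated assumptions: `Sample`, `LineReader`, `CountStandardiser`, and `EtlApp` are hypothetical stand-ins, the two-column input format is invented for the example, and using the file stem as the default name is an assumption about what "the profile's filename" means.

```python
import tempfile
from pathlib import Path
from typing import Dict, List, NamedTuple, Optional, Type


class Sample(NamedTuple):
    """Hypothetical stand-in for the Sample type."""

    name: str
    profile: List[Dict[str, int]]


class LineReader:
    """Hypothetical reader for 'taxonomy_id<TAB>count' lines."""

    @classmethod
    def read(cls, profile: Path) -> List[List[str]]:
        lines = profile.read_text().splitlines()
        return [line.split("\t") for line in lines if line]


class CountStandardiser:
    """Hypothetical standardiser that converts raw fields to integers."""

    @classmethod
    def transform(cls, profile: List[List[str]]) -> List[Dict[str, int]]:
        return [
            {"taxonomy_id": int(tax_id), "count": int(count)}
            for tax_id, count in profile
        ]


class EtlApp:
    """Combine an injected reader class and standardiser class."""

    def __init__(
        self,
        *,
        profile_reader: Type[LineReader],
        profile_standardiser: Type[CountStandardiser],
        **kwargs,
    ) -> None:
        super().__init__(**kwargs)  # passed on for inheritance
        self.reader = profile_reader
        self.standardiser = profile_standardiser

    def etl(self, profile: Path, name: Optional[str] = None) -> Sample:
        if name is None:
            name = profile.stem  # assumption: extension is dropped
        raw = self.reader.read(profile)
        return Sample(name=name, profile=self.standardiser.transform(raw))


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample1.tsv"
    path.write_text("562\t10\n1280\t5\n")
    app = EtlApp(profile_reader=LineReader, profile_standardiser=CountStandardiser)
    sample = app.etl(path)
    renamed = app.etl(path, name="control")
```

Because the reader and standardiser are passed in as classes, the same application object works for any profiler whose classes expose `read` and `transform`.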
sample_sheet
¶
Provide a description of samples and profile locations.
Classes¶
SampleSheet
¶
Bases: pa.SchemaModel
Define a description of samples and profile locations.
Source code in taxpasta/infrastructure/application/sample_sheet.py
profile: Series[str] = pa.Field()
class-attribute
¶sample: Series[str] = pa.Field()
class-attribute
¶Config
¶Configure the schema model.
Source code in taxpasta/infrastructure/application/sample_sheet.py
check_number_samples(table: DataFrame) -> bool
classmethod
¶Check that there are at least two samples.
check_profile_presence(profile: Series[str]) -> Series[bool]
classmethod
¶Check that every profile is present at the specified location.
Source code in taxpasta/infrastructure/application/sample_sheet.py
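The two sample sheet checks can be pictured with plain Python data. In this sketch a sample sheet is a list of dictionaries with `sample` and `profile` keys, and the sample count is taken to be the number of rows. The helper bodies are illustrative assumptions and do not use pandera.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def check_number_samples(table: List[Dict[str, str]]) -> bool:
    """Return whether the sheet lists at least two samples."""
    return len(table) >= 2


def check_profile_presence(table: List[Dict[str, str]]) -> List[bool]:
    """Return one flag per row: does the profile exist as a file?"""
    return [Path(row["profile"]).is_file() for row in table]


with tempfile.TemporaryDirectory() as tmp:
    existing = Path(tmp) / "a.tsv"
    existing.write_text("")
    sheet = [
        {"sample": "a", "profile": str(existing)},
        {"sample": "b", "profile": str(Path(tmp) / "missing.tsv")},
    ]
    enough = check_number_samples(sheet)
    present = check_profile_presence(sheet)
```

As in the signatures above, the sample count is a table-wide check with a single result, whereas profile presence is reported per row so that a missing file can be pinpointed.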
standard_profile_file_format
¶
standard_profile_writer
¶
Modules¶
arrow_standard_profile_writer
¶Provide an arrow writer.
ArrowStandardProfileWriter
¶
Bases: StandardProfileWriter
Define the arrow writer.
Source code in taxpasta/infrastructure/application/standard_profile_writer/arrow_standard_profile_writer.py
write(profile: DataFrame[StandardProfile], target: BinaryBufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given standardized profile to the given buffer or file.
Source code in taxpasta/infrastructure/application/standard_profile_writer/arrow_standard_profile_writer.py
csv_standard_profile_writer
¶Provide a CSV writer.
CSVStandardProfileWriter
¶
Bases: StandardProfileWriter
Define the CSV writer.
Source code in taxpasta/infrastructure/application/standard_profile_writer/csv_standard_profile_writer.py
write(profile: DataFrame[StandardProfile], target: BufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given standardized profile to the given buffer or file.
Source code in taxpasta/infrastructure/application/standard_profile_writer/csv_standard_profile_writer.py
ods_standard_profile_writer
¶Provide an ODS writer.
ODSStandardProfileWriter
¶
Bases: StandardProfileWriter
Define the ODS writer.
Source code in taxpasta/infrastructure/application/standard_profile_writer/ods_standard_profile_writer.py
write(profile: DataFrame[StandardProfile], target: BinaryBufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given standardized profile to the given buffer or file.
Source code in taxpasta/infrastructure/application/standard_profile_writer/ods_standard_profile_writer.py
parquet_standard_profile_writer
¶Provide a parquet writer.
ParquetStandardProfileWriter
¶
Bases: StandardProfileWriter
Define the parquet writer.
Source code in taxpasta/infrastructure/application/standard_profile_writer/parquet_standard_profile_writer.py
write(profile: DataFrame[StandardProfile], target: BinaryBufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given standardized profile to the given buffer or file.
Source code in taxpasta/infrastructure/application/standard_profile_writer/parquet_standard_profile_writer.py
tsv_standard_profile_writer
¶Provide a TSV writer.
TSVStandardProfileWriter
¶
Bases: StandardProfileWriter
Define the TSV writer.
Source code in taxpasta/infrastructure/application/standard_profile_writer/tsv_standard_profile_writer.py
write(profile: DataFrame[StandardProfile], target: BufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given standardized profile to the given buffer or file.
Source code in taxpasta/infrastructure/application/standard_profile_writer/tsv_standard_profile_writer.py
xlsx_standard_profile_writer
¶Provide an XLSX writer.
XLSXStandardProfileWriter
¶
Bases: StandardProfileWriter
Define the XLSX writer.
Source code in taxpasta/infrastructure/application/standard_profile_writer/xlsx_standard_profile_writer.py
write(profile: DataFrame[StandardProfile], target: BinaryBufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given standardized profile to the given buffer or file.
Source code in taxpasta/infrastructure/application/standard_profile_writer/xlsx_standard_profile_writer.py
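The text-based writers above accept either a buffer or a file path as `target`. The sketch below shows one way to support both with the standard `csv` module. `write_tsv` is a hypothetical helper, the `taxonomy_id` and `count` columns are an assumed layout for the standardized profile, and the `taxonomy` and `**kwargs` parameters of the documented signature are left out for brevity.

```python
import csv
import io
from pathlib import Path
from typing import Dict, List, TextIO, Union

# Assumed column layout of a standardized profile.
COLUMNS = ["taxonomy_id", "count"]


def _write_rows(profile: List[Dict[str, int]], handle: TextIO) -> None:
    """Write a header and all rows as tab-separated values."""
    writer = csv.DictWriter(
        handle, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(profile)


def write_tsv(
    profile: List[Dict[str, int]], target: Union[str, Path, TextIO]
) -> None:
    """Write the profile to a path (opened here) or to an open text buffer."""
    if isinstance(target, (str, Path)):
        with open(target, "w", newline="", encoding="utf-8") as handle:
            _write_rows(profile, handle)
    else:
        _write_rows(profile, target)


buffer = io.StringIO()
write_tsv([{"taxonomy_id": 562, "count": 10}], buffer)
```

Binary formats such as Arrow, Parquet, ODS, and XLSX take a `BinaryBufferOrFilepath` in the signatures above, so an equivalent helper for those would need a bytes buffer instead of a text buffer.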
supported_profiler
¶
Provide an enumeration of supported taxonomic profilers.
Classes¶
SupportedProfiler
¶
Bases: str, Enum
Define supported taxonomic profilers.
Source code in taxpasta/infrastructure/application/supported_profiler.py
bracken = 'bracken'
class-attribute
¶centrifuge = 'centrifuge'
class-attribute
¶diamond = 'diamond'
class-attribute
¶kaiju = 'kaiju'
class-attribute
¶kraken2 = 'kraken2'
class-attribute
¶krakenuniq = 'krakenuniq'
class-attribute
¶megan6 = 'megan6'
class-attribute
¶metaphlan = 'metaphlan'
class-attribute
¶motus = 'motus'
class-attribute
¶
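Deriving from both `str` and `Enum`, as `SupportedProfiler` does, makes each member a string that can be looked up by value and compared with plain text. A short sketch with a hypothetical two-member `Profiler` enumeration and a hypothetical `parse_profiler` helper:

```python
from enum import Enum


class Profiler(str, Enum):
    """Hypothetical two-member stand-in for SupportedProfiler."""

    bracken = "bracken"
    kraken2 = "kraken2"


def parse_profiler(text: str) -> Profiler:
    """Look up a member by value; unknown values raise ValueError."""
    return Profiler(text.strip().lower())


choice = parse_profiler(" Kraken2 ")
is_plain_string = isinstance(choice, str)
equals_text = choice == "kraken2"
```

This is convenient for command-line interfaces, where the profiler name arrives as text and must be restricted to a fixed set of choices.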
table_reader
¶
Modules¶
arrow_table_reader
¶csv_table_reader
¶ods_table_reader
¶Provide an ODS reader.
parquet_table_reader
¶Provide a parquet reader.
tsv_table_reader
¶xlsx_table_reader
¶Provide an XLSX reader.
table_reader_file_format
¶
tidy_observation_table_file_format
¶
tidy_observation_table_writer
¶
Modules¶
arrow_table_writer
¶Provide an arrow writer.
ArrowTidyObservationTableWriter
¶
Bases: TidyObservationTableWriter
Define the arrow writer.
Source code in taxpasta/infrastructure/application/tidy_observation_table_writer/arrow_table_writer.py
write(table: DataFrame[TidyObservationTable], target: BinaryBufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given table to the given buffer or file.
Source code in taxpasta/infrastructure/application/tidy_observation_table_writer/arrow_table_writer.py
csv_table_writer
¶Provide a CSV writer.
CSVTidyObservationTableWriter
¶
Bases: TidyObservationTableWriter
Define the CSV writer.
Source code in taxpasta/infrastructure/application/tidy_observation_table_writer/csv_table_writer.py
write(table: DataFrame[TidyObservationTable], target: BufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given table to the given buffer or file.
Source code in taxpasta/infrastructure/application/tidy_observation_table_writer/csv_table_writer.py
ods_table_writer
¶Provide an ODS writer.
ODSTidyObservationTableWriter
¶
Bases: TidyObservationTableWriter
Define the ODS writer.
Source code in taxpasta/infrastructure/application/tidy_observation_table_writer/ods_table_writer.py
write(table: DataFrame[TidyObservationTable], target: BinaryBufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given table to the given buffer or file.
Source code in taxpasta/infrastructure/application/tidy_observation_table_writer/ods_table_writer.py
parquet_table_writer
¶Provide a parquet writer.
ParquetTidyObservationTableWriter
¶
Bases: TidyObservationTableWriter
Define the parquet writer.
Source code in taxpasta/infrastructure/application/tidy_observation_table_writer/parquet_table_writer.py
write(table: DataFrame[TidyObservationTable], target: BinaryBufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given table to the given buffer or file.
Source code in taxpasta/infrastructure/application/tidy_observation_table_writer/parquet_table_writer.py
tsv_table_writer
¶Provide a TSV writer.
TSVTidyObservationTableWriter
¶
Bases: TidyObservationTableWriter
Define the TSV writer.
Source code in taxpasta/infrastructure/application/tidy_observation_table_writer/tsv_table_writer.py
write(table: DataFrame[TidyObservationTable], target: BufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given table to the given buffer or file.
Source code in taxpasta/infrastructure/application/tidy_observation_table_writer/tsv_table_writer.py
xlsx_table_writer
¶Provide an XLSX writer.
XLSXTidyObservationTableWriter
¶
Bases: TidyObservationTableWriter
Define the XLSX writer.
Source code in taxpasta/infrastructure/application/tidy_observation_table_writer/xlsx_table_writer.py
write(table: DataFrame[TidyObservationTable], target: BinaryBufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given table to the given buffer or file.
Source code in taxpasta/infrastructure/application/tidy_observation_table_writer/xlsx_table_writer.py
wide_observation_table_file_format
¶
wide_observation_table_writer
¶
Modules¶
arrow_wide_observation_table_writer
¶Provide an arrow writer.
ArrowWideObservationTableWriter
¶
Bases: WideObservationTableWriter
Define the arrow writer.
Source code in taxpasta/infrastructure/application/wide_observation_table_writer/arrow_wide_observation_table_writer.py
write(matrix: DataFrame[WideObservationTable], target: BinaryBufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given table to the given buffer or file.
Source code in taxpasta/infrastructure/application/wide_observation_table_writer/arrow_wide_observation_table_writer.py
biom_wide_observation_table_writer
¶Provide a Biological Observation Matrix (BIOM) writer.
BIOMWideObservationTableWriter
¶
Bases: WideObservationTableWriter
Define the Biological Observation Matrix (BIOM) writer.
Source code in taxpasta/infrastructure/application/wide_observation_table_writer/biom_wide_observation_table_writer.py
write(matrix: DataFrame[WideObservationTable], target: Filepath, taxonomy: Optional[Taxonomy] = None, generated_by: str = 'taxpasta', **kwargs) -> None
classmethod
¶Write the given data to the given file.
Source code in taxpasta/infrastructure/application/wide_observation_table_writer/biom_wide_observation_table_writer.py
csv_wide_observation_table_writer
¶Provide a CSV writer.
CSVWideObservationTableWriter
¶
Bases: WideObservationTableWriter
Define the CSV writer.
Source code in taxpasta/infrastructure/application/wide_observation_table_writer/csv_wide_observation_table_writer.py
write(matrix: DataFrame[WideObservationTable], target: BufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given table to the given buffer or file.
Source code in taxpasta/infrastructure/application/wide_observation_table_writer/csv_wide_observation_table_writer.py
ods_wide_observation_table_writer
¶Provide an ODS writer.
ODSWideObservationTableWriter
¶
Bases: WideObservationTableWriter
Define the ODS writer.
Source code in taxpasta/infrastructure/application/wide_observation_table_writer/ods_wide_observation_table_writer.py
write(matrix: DataFrame[WideObservationTable], target: BinaryBufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given table to the given buffer or file.
Source code in taxpasta/infrastructure/application/wide_observation_table_writer/ods_wide_observation_table_writer.py
parquet_wide_observation_table_writer
¶Provide a parquet writer.
ParquetWideObservationTableWriter
¶
Bases: WideObservationTableWriter
Define the parquet writer.
Source code in taxpasta/infrastructure/application/wide_observation_table_writer/parquet_wide_observation_table_writer.py
write(matrix: DataFrame[WideObservationTable], target: BinaryBufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given table to the given buffer or file.
Source code in taxpasta/infrastructure/application/wide_observation_table_writer/parquet_wide_observation_table_writer.py
tsv_wide_observation_table_writer
¶Provide a TSV writer.
TSVWideObservationTableWriter
¶
Bases: WideObservationTableWriter
Define the TSV writer.
Source code in taxpasta/infrastructure/application/wide_observation_table_writer/tsv_wide_observation_table_writer.py
write(matrix: DataFrame[WideObservationTable], target: BufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given table to the given buffer or file.
Source code in taxpasta/infrastructure/application/wide_observation_table_writer/tsv_wide_observation_table_writer.py
xlsx_wide_observation_table_writer
¶Provide an XLSX writer.
XLSXWideObservationTableWriter
¶
Bases: WideObservationTableWriter
Define the XLSX writer.
Source code in taxpasta/infrastructure/application/wide_observation_table_writer/xlsx_wide_observation_table_writer.py
write(matrix: DataFrame[WideObservationTable], target: BinaryBufferOrFilepath, taxonomy: Optional[Taxonomy] = None, **kwargs) -> None
classmethod
¶Write the given table to the given buffer or file.
Source code in taxpasta/infrastructure/application/wide_observation_table_writer/xlsx_wide_observation_table_writer.py
cli
¶
Attributes¶
Modules¶
consensus
¶
merge
¶
Add the merge command to the taxpasta CLI.
Attributes¶
logger = logging.getLogger(__name__)
module-attribute
¶Classes¶
Functions¶
merge(profiles: Optional[List[Path]] = typer.Argument(None, metavar='[PROFILE1 PROFILE2 [...]]', help='Two or more files containing taxonomic profiles. Required unless there is a sample sheet. Filenames will be parsed as sample names.', show_default=False), profiler: SupportedProfiler = typer.Option(Ellipsis, '--profiler', '-p', case_sensitive=False, help='The taxonomic profiler used. All provided profiles must come from the same tool!', show_default=False), sample_sheet: Optional[Path] = typer.Option(None, '--samplesheet', '-s', help="A table with a header and two columns: the first column named 'sample' which can be any string and the second column named 'profile' which must be a file path to an actual taxonomic abundance profile. If this option is provided, any arguments are ignored.", exists=True, file_okay=True, dir_okay=False, readable=True), samplesheet_format: Optional[TableReaderFileFormat] = typer.Option(None, case_sensitive=False, help='The file format of the sample sheet. Depending on the choice, additional package dependencies may apply. Will be parsed from the sample sheet file name but can be set explicitly.'), output: Path = typer.Option(Ellipsis, '--output', '-o', help='The desired output file. By default, the file extension will be used to determine the output format.', show_default=False), output_format: Optional[WideObservationTableFileFormat] = typer.Option(None, case_sensitive=False, help='The desired output format. Depending on the choice, additional package dependencies may apply. Will be parsed from the output file name but can be set explicitly.'), wide_format: bool = typer.Option(True, '--wide/--long', help='Output merged abundance data in either wide or (tidy) long format. Ignored when the desired output format is BIOM.'))
¶Standardise and merge two or more taxonomic profiles into a single table.
Source code in taxpasta/infrastructure/cli/merge.py
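The --wide/--long option selects between two layouts of the same merged data. A minimal sketch of the relationship between them follows, using plain Python records. The column names `taxonomy_id`, `sample`, and `count`, the helper `to_wide`, and the zero fill for missing taxa are assumptions made for illustration, not taxpasta's own code.

```python
from typing import Dict, List

# Hypothetical long (tidy) records: one row per taxon and sample.
long_rows = [
    {"taxonomy_id": "562", "sample": "A", "count": 10},
    {"taxonomy_id": "562", "sample": "B", "count": 3},
    {"taxonomy_id": "1280", "sample": "A", "count": 7},
]


def to_wide(rows: List[dict]) -> List[dict]:
    """Pivot long records into one row per taxon with one column per sample."""
    samples = sorted({row["sample"] for row in rows})
    wide: Dict[str, dict] = {}
    for row in rows:
        # Taxa that are absent from a sample default to a count of zero
        # (an assumption of this sketch).
        entry = wide.setdefault(
            row["taxonomy_id"],
            {"taxonomy_id": row["taxonomy_id"], **{s: 0 for s in samples}},
        )
        entry[row["sample"]] = row["count"]
    return list(wide.values())


wide_rows = to_wide(long_rows)
```

In the long layout every observation is its own row, which suits filtering and plotting. The wide layout has one column per sample, which is the shape a matrix format such as BIOM expects.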
read_sample_sheet(sample_sheet: Path, sample_format: TableReaderFileFormat) -> DataFrame[SampleSheet]
¶Extract and validate the sample sheet.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample_sheet | Path | Path to the sample sheet. | required |
sample_format | TableReaderFileFormat | The determined file format. | required |
Returns:
Type | Description |
---|---|
DataFrame[SampleSheet] | A pandas data frame in the form of a sample sheet. |
Raises:
Type | Description |
---|---|
Exit | Early abortion of program when there is a schema error. |
Source code in taxpasta/infrastructure/cli/merge.py
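A sample sheet, as described by the --samplesheet option, is a table with a header and the two columns sample and profile. The header check can be sketched with the standard library as below. The function name `read_sample_sheet_sketch`, the tab-separated input, and the profile file names are hypothetical. A `ValueError` stands in for the `Exit` that the real function raises on a schema error.

```python
import csv
import io
from typing import Dict, List, TextIO

EXPECTED_COLUMNS = ["sample", "profile"]


def read_sample_sheet_sketch(handle: TextIO) -> List[Dict[str, str]]:
    """Parse a tab-separated sample sheet and check its header."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames != EXPECTED_COLUMNS:
        # The real command exits the program here; this sketch raises instead.
        raise ValueError(
            f"Expected columns {EXPECTED_COLUMNS}, got {reader.fieldnames}"
        )
    return list(reader)


# Hypothetical sample sheet content with two profiles.
sheet = io.StringIO("sample\tprofile\nA\tA.kreport.txt\nB\tB.kreport.txt\n")
rows = read_sample_sheet_sketch(sheet)
```

This sketch only validates the header. It does not check that each profile path points to an existing file.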
validate_observation_matrix_format(output: Path, output_format: Optional[str]) -> WideObservationTableFileFormat
¶Detect the output format if it isn't given.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output | Path | Path for the output. | required |
output_format | Optional[str] | The selected file format if any. | required |
Returns:
Type | Description |
---|---|
WideObservationTableFileFormat | The validated output file format. |
Raises:
Type | Description |
---|---|
Exit | Early abortion of program when the format cannot be guessed or dependencies are missing. |
Source code in taxpasta/infrastructure/cli/merge.py
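The validation functions in this module take an explicit format when one is given and otherwise guess it from the file name. A rough sketch of that fallback logic follows. `SketchFileFormat`, its three members, and `detect_format` are hypothetical names. The real functions also check for optional dependencies and exit the program instead of raising a `ValueError`.

```python
from enum import Enum
from pathlib import Path
from typing import Optional


class SketchFileFormat(str, Enum):
    """Hypothetical subset of output formats, for illustration only."""

    TSV = "TSV"
    CSV = "CSV"
    XLSX = "XLSX"


def detect_format(output: Path, output_format: Optional[str]) -> SketchFileFormat:
    """Return the explicit format if given, else guess it from the extension."""
    # An explicit choice wins; otherwise fall back to the suffix without its dot.
    name = output_format if output_format is not None else output.suffix.lstrip(".")
    try:
        return SketchFileFormat(name.upper())
    except ValueError:
        # The real CLI exits here; this sketch raises a plain error instead.
        raise ValueError(
            f"Cannot determine the output format of '{output}'."
        ) from None


guessed = detect_format(Path("merged.tsv"), None)
explicit = detect_format(Path("merged.txt"), "csv")
```

Here `guessed` comes from the `.tsv` extension, while `explicit` ignores the `.txt` extension because a format was passed in.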
validate_sample_format(sample_sheet: Path, sample_format: Optional[TableReaderFileFormat]) -> TableReaderFileFormat
¶Detect the sample sheet format if it isn't given.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sample_sheet | Path | Path to the sample sheet. | required |
sample_format | Optional[TableReaderFileFormat] | The selected file format if any. | required |
Returns:
Type | Description |
---|---|
TableReaderFileFormat | The validated sample sheet format. |
Raises:
Type | Description |
---|---|
Exit | Early abortion of program when the format cannot be guessed or dependencies are missing. |
Source code in taxpasta/infrastructure/cli/merge.py
validate_tidy_observation_table_format(output: Path, output_format: Optional[str]) -> TidyObservationTableFileFormat
¶Detect the output format if it isn't given.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output | Path | Path for the output. | required |
output_format | Optional[str] | The selected file format if any. | required |
Returns:
Type | Description |
---|---|
TidyObservationTableFileFormat | The validated output file format. |
Raises:
Type | Description |
---|---|
Exit | Early abortion of program when the format cannot be guessed or dependencies are missing. |
Source code in taxpasta/infrastructure/cli/merge.py
standardise
¶
Add the standardise command to the taxpasta CLI.
Attributes¶
logger = logging.getLogger(__name__)
module-attribute
¶Classes¶
Functions¶
standardise(profile: Path = typer.Argument(Ellipsis, metavar='PROFILE', help='A file containing a taxonomic profile.', show_default=False), profiler: SupportedProfiler = typer.Option(Ellipsis, '--profiler', '-p', case_sensitive=False, help='The taxonomic profiler used.', show_default=False), output: Path = typer.Option(Ellipsis, '--output', '-o', help='The desired output file. By default, the file extension will be used to determine the output format.', show_default=False), output_format: Optional[StandardProfileFileFormat] = typer.Option(None, case_sensitive=False, help='The desired output format. Depending on the choice, additional package dependencies may apply. Will be parsed from the output file name but can be set explicitly.'))
¶Standardise a taxonomic profile.
Source code in taxpasta/infrastructure/cli/standardise.py
validate_output_format(output: Path, output_format: Optional[str]) -> StandardProfileFileFormat
¶Detect the output format if it isn't given.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
output | Path | Path for the output. | required |
output_format | Optional[str] | The selected file format if any. | required |
Returns:
Type | Description |
---|---|
StandardProfileFileFormat | The validated output file format. |
Raises:
Type | Description |
---|---|
Exit | Early abortion of program when the format cannot be guessed or dependencies are missing. |
Source code in taxpasta/infrastructure/cli/standardise.py
taxpasta
¶
Provide a command-line interface (CLI) for taxpasta functionality.
Attributes¶
app = typer.Typer(help='TAXonomic Profile Aggregation and STAndardisation', context_settings={'help_option_names': ['-h', '--help']})
module-attribute
¶logger = logging.getLogger('taxpasta')
module-attribute
¶Classes¶
LogLevel
¶
Bases: str, Enum
Define the choices for the log level option.
Source code in taxpasta/infrastructure/cli/taxpasta.py
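Deriving an option type from both str and Enum lets a CLI framework present the members as a fixed set of choices while each member still behaves like a plain string. A small sketch of the idea follows. `LogLevelSketch`, its member list, and `to_logging_level` are assumptions, since the members of the real LogLevel class are not listed on this page.

```python
import logging
from enum import Enum


class LogLevelSketch(str, Enum):
    """Hypothetical log level choices; the member list is an assumption."""

    DEBUG = "DEBUG"
    INFO = "INFO"
    WARNING = "WARNING"
    ERROR = "ERROR"
    CRITICAL = "CRITICAL"


def to_logging_level(choice: str) -> int:
    """Translate a case-insensitive choice into a numeric logging level."""
    level = LogLevelSketch(choice.upper())
    # The standard logging module exposes each level as a module constant.
    return getattr(logging, level.name)


numeric = to_logging_level("debug")
```

Because the members are also strings, a comparison such as `LogLevelSketch.INFO == "INFO"` holds, which keeps the values easy to pass on to logging configuration.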
Functions¶
initialize(context: typer.Context, version: Optional[bool] = typer.Option(None, '--version', callback=version_callback, is_eager=True, help='Print only the current tool version and exit.'), log_level: LogLevel = typer.Option(LogLevel.INFO.name, '--log-level', '-l', case_sensitive=False, help='Set the desired log level.'))
¶Initialize logging and rich printing if available.
Source code in taxpasta/infrastructure/cli/taxpasta.py
version_callback(is_set: bool) -> None
¶Print the tool version if desired.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
is_set | bool | Whether the version was requested as a command line option. | required |
Raises:
Type | Description |
---|---|
Exit | With default code 0 to signal normal program end. |