Class AbstractBatchedObjectColumnProcessor<T extends Context>

All Implemented Interfaces:
ConversionProcessor, Processor<T>
Direct Known Subclasses:
BatchedObjectColumnProcessor

public abstract class AbstractBatchedObjectColumnProcessor<T extends Context> extends AbstractObjectProcessor<T> implements Processor<T>
A Processor implementation for converting batches of rows extracted from any implementation of AbstractParser into columns of objects.

This uses the value conversions provided by Conversion instances.

For each row processed, a sequence of conversions will be executed to generate the appropriate object. Each resulting object will then be stored in a list that contains the values of the corresponding column.

During the execution of the process, the batchProcessed(int) method will be invoked after a given number of rows has been processed.

The user can access the lists with values parsed for all columns using the methods getColumnValuesAsList(), getColumnValuesAsMapOfIndexes() and getColumnValuesAsMapOfNames().

After batchProcessed(int) is invoked, all values will be discarded and the next batch of column values will be accumulated. This process will repeat until there are no more rows in the input.
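
The accumulate, notify, discard cycle described above can be illustrated with a minimal stand-alone sketch. It does not use the library: BatchSketch and its nested Columns class are hypothetical stand-ins, not the actual implementation. The sketch also assumes that every row has the same number of columns and that a final, smaller batch is reported when the input ends; neither is stated in the description above.

```java
import java.util.ArrayList;
import java.util.List;

class BatchSketch {

    /** Hypothetical stand-in for a batched column processor (not library code). */
    abstract static class Columns {
        private final int rowsPerBatch;
        private final List<List<Object>> columns = new ArrayList<>();
        private int rowsInBatch;
        private int batchesProcessed;

        Columns(int rowsPerBatch) {
            this.rowsPerBatch = rowsPerBatch;
        }

        /** Appends each value of the row to the list of its column. */
        void rowProcessed(Object[] row) {
            for (int i = 0; i < row.length; i++) {
                while (columns.size() <= i) {
                    columns.add(new ArrayList<>());
                }
                columns.get(i).add(row[i]);
            }
            if (++rowsInBatch == rowsPerBatch) {
                flush();
            }
        }

        /** Assumption: rows left over at the end of the input form a last batch. */
        void processEnded() {
            if (rowsInBatch > 0) {
                flush();
            }
        }

        private void flush() {
            batchProcessed(rowsInBatch); // column values are readable in here
            columns.clear();             // then discarded before the next batch
            rowsInBatch = 0;
            batchesProcessed++;
        }

        List<Object> getColumn(int columnIndex) {
            return columns.get(columnIndex);
        }

        int getBatchesProcessed() {
            return batchesProcessed;
        }

        abstract void batchProcessed(int rowsInThisBatch);
    }

    /** Feeds three rows through a processor with two rows per batch. */
    static List<String> run() {
        List<String> seen = new ArrayList<>();
        Columns processor = new Columns(2) {
            @Override
            void batchProcessed(int rowsInThisBatch) {
                seen.add(rowsInThisBatch + " rows, column 0 = " + getColumn(0));
            }
        };
        processor.rowProcessed(new Object[]{1, "a"});
        processor.rowProcessed(new Object[]{2, "b"});
        processor.rowProcessed(new Object[]{3, "c"});
        processor.processEnded();
        return seen;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Because the values are cleared right after the callback returns, any data needed later has to be copied or consumed inside batchProcessed(int).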

Author:
Univocity Software Pty Ltd - parsers@univocity.com
  • Constructor Details

    • AbstractBatchedObjectColumnProcessor

      public AbstractBatchedObjectColumnProcessor(int rowsPerBatch)
      Constructs an abstract batched column processor configured to invoke the batchProcessed(int) method after a given number of rows has been processed.
      Parameters:
      rowsPerBatch - the number of rows to process in each batch.
  • Method Details

    • processStarted

      public void processStarted(T context)
      Description copied from interface: Processor
      This method will be invoked by the parser once, when it is ready to start processing the input.
      Specified by:
      processStarted in interface Processor<T extends Context>
      Overrides:
      processStarted in class AbstractObjectProcessor<T extends Context>
      Parameters:
      context - A contextual object with information and controls over the current state of the parsing process
    • rowProcessed

      public void rowProcessed(Object[] row, T context)
      Description copied from class: AbstractObjectProcessor
      Invoked by the processor after all values of a valid record have been processed and converted into an Object array.
      Specified by:
      rowProcessed in class AbstractObjectProcessor<T extends Context>
      Parameters:
      row - object array created with the information extracted by the parser and then converted.
      context - A contextual object with information and controls over the current state of the parsing process
    • processEnded

      public void processEnded(T context)
      Description copied from interface: Processor
      This method will be invoked by the parser once, after the parsing process has stopped and all resources have been closed.

      It will always be called by the parser: in case of errors, if the end of the input is reached, or if the user stopped the process manually using Context.stop().

      Specified by:
      processEnded in interface Processor<T extends Context>
      Overrides:
      processEnded in class AbstractObjectProcessor<T extends Context>
      Parameters:
      context - A contextual object with information and controls over the state of the parsing process
    • getHeaders

      public final String[] getHeaders()
    • getColumnValuesAsList

      public final List<List<Object>> getColumnValuesAsList()
    • putColumnValuesInMapOfNames

      public final void putColumnValuesInMapOfNames(Map<String,List<Object>> map)
    • putColumnValuesInMapOfIndexes

      public final void putColumnValuesInMapOfIndexes(Map<Integer,List<Object>> map)
    • getColumnValuesAsMapOfNames

      public final Map<String,List<Object>> getColumnValuesAsMapOfNames()
    • getColumnValuesAsMapOfIndexes

      public final Map<Integer,List<Object>> getColumnValuesAsMapOfIndexes()
    • getColumn

      public List<Object> getColumn(String columnName)
    • getColumn

      public List<Object> getColumn(int columnIndex)
    • getColumn

      public <V> List<V> getColumn(String columnName, Class<V> columnType)
      Returns the values of a given column.
      Type Parameters:
      V - the type of data in that column
      Parameters:
      columnName - the name of the column in the input.
      columnType - the type of data in that column
      Returns:
      a list with all data stored in the given column
    • getColumn

      public <V> List<V> getColumn(int columnIndex, Class<V> columnType)
      Returns the values of a given column.
      Type Parameters:
      V - the type of data in that column
      Parameters:
      columnIndex - the position of the column in the input (0-based).
      columnType - the type of data in that column
      Returns:
      a list with all data stored in the given column
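
As an illustration of what a typed column view involves, the sketch below converts an untyped list of column values into a List<V> by checking each element against the requested class. TypedColumnSketch and typedColumn are hypothetical names used only for this example. Whether the library validates element types in this way is not stated here, so the fail-fast check is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TypedColumnSketch {

    /**
     * Hypothetical helper, not part of the library: copies an untyped
     * column into a List<V>, failing fast on a value of another type.
     */
    static <V> List<V> typedColumn(List<Object> values, Class<V> columnType) {
        List<V> typed = new ArrayList<>(values.size());
        for (Object value : values) {
            // Class.cast accepts null and throws ClassCastException on a mismatch.
            typed.add(columnType.cast(value));
        }
        return typed;
    }

    public static void main(String[] args) {
        List<Object> raw = Arrays.asList(10, 20, null);
        List<Integer> ints = typedColumn(raw, Integer.class);
        System.out.println(ints); // prints [10, 20, null]
    }
}
```

Requesting a columnType that does not match the converted values (for example String.class for a column of Integer values) makes this helper throw a ClassCastException at the first offending element.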
    • getRowsPerBatch

      public int getRowsPerBatch()
    • getBatchesProcessed

      public int getBatchesProcessed()
    • batchProcessed

      public abstract void batchProcessed(int rowsInThisBatch)