Safe Haskell | None |
---|---|
Language | Haskell2010 |
This module is about stream-processing tar archives. It is currently
not very well tested. See the documentation of withEntries
for an usage sample.
Synopsis
- tar :: MonadThrow m => ConduitM (Either FileInfo ByteString) ByteString m FileOffset
- tarEntries :: MonadThrow m => ConduitM (Either Header ByteString) ByteString m FileOffset
- untar :: MonadThrow m => (FileInfo -> ConduitM ByteString o m ()) -> ConduitM ByteString o m ()
- untarWithFinalizers :: (MonadThrow m, MonadIO m) => (FileInfo -> ConduitM ByteString (IO ()) m ()) -> ConduitM ByteString c m ()
- untarWithExceptions :: (MonadThrow m, MonadIO m) => (FileInfo -> ConduitM ByteString (IO (FileInfo, [SomeException])) m ()) -> ConduitM ByteString c m [(FileInfo, [SomeException])]
- restoreFile :: MonadResource m => FileInfo -> ConduitM ByteString (IO ()) m ()
- restoreFileInto :: MonadResource m => FilePath -> FileInfo -> ConduitM ByteString (IO ()) m ()
- restoreFileIntoLenient :: MonadResource m => FilePath -> FileInfo -> ConduitM ByteString (IO (FileInfo, [SomeException])) m ()
- restoreFileWithErrors :: MonadResource m => Bool -> FileInfo -> ConduitM ByteString (IO (FileInfo, [SomeException])) m ()
- untarChunks :: Monad m => ConduitM ByteString TarChunk m ()
- withEntry :: MonadThrow m => (Header -> ConduitM ByteString o m r) -> ConduitM TarChunk o m r
- withEntries :: MonadThrow m => (Header -> ConduitM ByteString o m ()) -> ConduitM TarChunk o m ()
- withFileInfo :: MonadThrow m => (FileInfo -> ConduitM ByteString o m ()) -> ConduitM TarChunk o m ()
- headerFileType :: Header -> FileType
- headerFilePath :: Header -> FilePath
- tarFilePath :: (MonadThrow m, MonadResource m) => ConduitM FilePath ByteString m FileOffset
- filePathConduit :: (MonadThrow m, MonadResource m) => ConduitM FilePath (Either FileInfo ByteString) m ()
- createTarball :: FilePath -> [FilePath] -> IO ()
- writeTarball :: Handle -> [FilePath] -> IO ()
- extractTarball :: FilePath -> Maybe FilePath -> IO ()
- extractTarballLenient :: FilePath -> Maybe FilePath -> IO [(FileInfo, [SomeException])]
- module Data.Conduit.Tar.Types
Basic functions
tar :: MonadThrow m => ConduitM (Either FileInfo ByteString) ByteString m FileOffset Source #
Create a tar archive by suppying a stream of Left
FileInfo
s. Whenever a
file type is FTNormal
, it must be immediately followed by its content as
Right
ByteString
. The produced ByteString
is in the raw tar format and
is properly terminated at the end, therefore it can not be extended
afterwards. Returned is the total size of the bytestring as a FileOffset
.
Since: 0.2.0
tarEntries :: MonadThrow m => ConduitM (Either Header ByteString) ByteString m FileOffset Source #
untar :: MonadThrow m => (FileInfo -> ConduitM ByteString o m ()) -> ConduitM ByteString o m () Source #
Just like withFileInfo
, but works directly on the stream of bytes.
Since: 0.2.0
untarWithFinalizers :: (MonadThrow m, MonadIO m) => (FileInfo -> ConduitM ByteString (IO ()) m ()) -> ConduitM ByteString c m () Source #
Just like untar
, except that each FileInfo
handling function can produce a finalizing
action, all of which will be executed after the whole tarball has been processed in the opposite
order. Very useful with restoreFile
and restoreFileInto
, since they restore direcory
modification timestamps only after files have been fully written to disk.
Since: 0.2.0
untarWithExceptions :: (MonadThrow m, MonadIO m) => (FileInfo -> ConduitM ByteString (IO (FileInfo, [SomeException])) m ()) -> ConduitM ByteString c m [(FileInfo, [SomeException])] Source #
Same as untarWithFinalizers
, but will also produce a list of any exceptions that might have
occured during restoration process.
Since: 0.2.5
restoreFile :: MonadResource m => FileInfo -> ConduitM ByteString (IO ()) m () Source #
Restore files onto the file system. Produces actions that will set the modification time on the directories, which can be executed after the pipeline has finished and all files have been written to disk.
restoreFileInto :: MonadResource m => FilePath -> FileInfo -> ConduitM ByteString (IO ()) m () Source #
Restore all files into a folder. Absolute file paths will be turned into relative to the supplied folder.
restoreFileIntoLenient :: MonadResource m => FilePath -> FileInfo -> ConduitM ByteString (IO (FileInfo, [SomeException])) m () Source #
Restore all files into a folder. Absolute file paths will be turned into relative to the supplied folder. Yields a list with exceptions instead of throwing them.
Since: 0.2.5
restoreFileWithErrors Source #
:: MonadResource m | |
=> Bool | Lenient flag, results in exceptions thrown instead of collected when set to |
-> FileInfo | |
-> ConduitM ByteString (IO (FileInfo, [SomeException])) m () |
Restore files onto the file system, much in the same way restoreFile
does it, except with
ability to ignore restoring problematic files and report errors that occured as a list of
exceptions, which will be returned as a list when finilizer executed. If a list is empty, it
means, that no errors occured and a file only had a finilizer associated with it.
Since: 0.2.4
Operate on Chunks
untarChunks :: Monad m => ConduitM ByteString TarChunk m () Source #
Convert a stream of raw bytes into a stream of TarChunk
s. This stream can further be passed
into withFileInfo
or withHeaders
functions.
Since: 0.2.1
withEntry :: MonadThrow m => (Header -> ConduitM ByteString o m r) -> ConduitM TarChunk o m r Source #
Process a single tar entry. See withEntries
for more details.
Since: 0.1.0
withEntries :: MonadThrow m => (Header -> ConduitM ByteString o m ()) -> ConduitM TarChunk o m () Source #
This function handles each entry of the tar archive according to the behaviour of the function passed as first argument.
Here is a full example function, that reads a compressed tar archive and for each entry that is a simple file, it prints its file path and SHA256 digest. Note that this function can throw exceptions!
import qualified Crypto.Hash.Conduit as CH import qualified Data.Conduit.Tar as CT import Conduit import Crypto.Hash (Digest, SHA256) import Control.Monad (when) import Data.Conduit.Zlib (ungzip) import Data.ByteString (ByteString) filedigests :: FilePath -> IO () filedigests fp = runConduitRes ( sourceFileBS fp -- read the raw file .| ungzip -- gunzip .| CT.untarChunks -- decode the tar archive .| CT.withEntries hashentry -- process each file .| printC -- print the results ) where hashentry :: Monad m => CT.Header -> Conduit ByteString m (FilePath, Digest SHA256) hashentry hdr = when (CT.headerFileType hdr == CT.FTNormal) $ do hash <- CH.sinkHash yield (CT.headerFilePath hdr, hash)
The hashentry
function handles a single entry, based on its first Header
argument.
In this example, a Consumer
is used to process the whole entry.
Note that the benefits of stream processing are easily lost when working with a Consumer
. For
example, the following implementation would have used an unbounded amount of memory:
hashentry hdr = when (CT.headerFileType hdr == CT.FTNormal) $ do content <- mconcat <$> sinkList yield (CT.headerFilePath hdr, hash content)
Since: 0.1.0
withFileInfo :: MonadThrow m => (FileInfo -> ConduitM ByteString o m ()) -> ConduitM TarChunk o m () Source #
Extract a tarball, similarly to withEntries
, but instead of dealing directly with tar format,
this conduit allows you to work directly on file abstractions FileInfo
. For now support is
minimal:
- Old v7 tar format.
- ustar: POSIX 1003.1-1988 format
- and only some portions of GNU format:
- Larger values for
fileUserId
,fileGroupId
,fileSize
andfileModTime
. L
type - long file names, but only up to 4096 chars to prevent DoS attack- other types are simply discarded
Note - Here is a really good reference for specifics of different tar formats: https://github.com/libarchive/libarchive/wiki/ManPageTar5
Since: 0.2.2
Helper functions
headerFileType :: Header -> FileType Source #
Get Header file type.
Since: 0.1.0
headerFilePath :: Header -> FilePath Source #
Construct a FilePath
from headerFileNamePrefix
and headerFileNameSuffix
.
Since: 0.1.0
Creation
tarFilePath :: (MonadThrow m, MonadResource m) => ConduitM FilePath ByteString m FileOffset Source #
Recursively tar all of the files and directories. There will be no
conversion between relative and absolute paths, so just like with GNU tar
cli tool, it may be necessary to setCurrentDirectory
in order to get the
paths relative. Using filePathConduit
directly, while modifying the
filePath
, would be another approach to handling the file paths.
Since: 0.2.0
filePathConduit :: (MonadThrow m, MonadResource m) => ConduitM FilePath (Either FileInfo ByteString) m () Source #
Turn a stream of file paths into a stream of FileInfo
and file
content. All paths will be decended into recursively.
Since: 0.2.0
Directly on files
:: FilePath | File name for the tarball |
-> [FilePath] | List of files and directories to include in the tarball |
-> IO () |
Uses tarFilePath
to create a tarball, that will recursively include the
supplied list of all the files and directories
Since: 0.2.0
:: Handle | Handle where created tarball will be written to |
-> [FilePath] | List of files and directories to include in the tarball |
-> IO () |
Take a list of files and paths, recursively tar them and write output into supplied handle.
Since: 0.2.0
:: FilePath | Filename for the tarball |
-> Maybe FilePath | Folder where tarball should be extract to. Default is the current path |
-> IO () |
Extract a tarball while using restoreFileInfo
for writing files onto the file
system. Restoration process is cross platform and should work concistently both on Windows and
Posix systems.
Since: 0.2.0
extractTarballLenient Source #
:: FilePath | Filename for the tarball |
-> Maybe FilePath | Folder where tarball should be extract to. Default is the current path |
-> IO [(FileInfo, [SomeException])] |
Same as extractTarball
, but ignores possible extraction errors. It can still throw a
TarException
if the tarball is corrupt or malformed.
Since: 0.2.5
Types
module Data.Conduit.Tar.Types