Opened 2 years ago
Closed 20 months ago
#17870 closed enhancement (fixed)
Possible: provide options to install less of texlive
Reported by: | Owned by: | ||
---|---|---|---|
Priority: | normal | Milestone: | 12.0 |
Component: | BOOK | Version: | git |
Severity: | normal | Keywords: | |
Cc: |
Description
It would be nice to allow users to install a smaller part of the current year's texlive.
Looking at gentoo, they have a lot of different configure options to allow conditional compilation of various parts - but at this stage I have no idea how they handle the related texmf parts - and it is texmf which uses the space. I suspect they include texmf in their source, and I assume that will make it unmanageably large for us, and for our mirrors.
Looking at Arch, all their builds start from the binary install-tl-unix and install parts of it into /usr, with various things to support their packaging. There is an AUR texlive full which installs in /opt, but it too uses the binary.
My understanding is that debian (and derivatives) compile from source and install parts, similarly fedora.
If providing instructions for a smaller build in BLFS, the builder will apparently need to have space for the untarred texmf tarball and then copy the necessary parts. That is true even for a minimal (non-latex) install.
Change History (9)
comment:1 by , 2 years ago
comment:2 by , 2 years ago
In the end, I came to understand that there is no easy way in a BLFS source build to disable most individual programs, let alone prevent the associated texmf-dist files from being installed (and doing a DESTDIR install in the absence of texmf showed that some scripts were installed, probably those where there is a symlink from the program directory).
The tlpdb database can be parsed, at least in theory, to establish which packages are in which scheme, and what is in each package (there are a couple of operations I do not grok for packaging less than all of the hyphen files, but keeping all will make only a very small difference). BUT - texlive is written on GNU principles: it ships all the source and scripts needed for recreating things. Unless you are deep into TeX development you will not usually want things such as the source for tfm fonts.
I have some notes. When I come back to this I will be thinking about providing a separate file, probably plain text in ~/ken, of some considerations for removing items after the install. Some people might prefer to remove all documentation (fine only if you are always online and your preferred search engine is always working). Many people will not need all the fonts. But what to remove is a very individual decision and people will need to review individual items before deciding.
comment:3 by , 21 months ago
Status update:
Since my example is only useful to my exact usage, although it hopefully will contain enough details to guide anyone who wants to take a similar approach to removing unnecessary things, I've been working on trying to parse the tlpdb. This is certainly straining my brain (using perl, bash is just too slow). Documenting current state of play now.
My intention was to work out what is needed, then create working copies of the required files and then tar those up and use them to replace the full install.
There is documentation on it at https://tug.org/TUGboat/tb34-3/tb108preining-distro.pdf but it might be out of date. My current understanding:
- There are TLCore items which provide files and programs.
- There are a number of schemes. I have been working on scheme-medium (there are a couple of larger schemes as well as scheme-full, which is what we build from source).
- A scheme depends on a number of collections.
- A collection may depend on other collections (e.g. collection-langchinese depends on collection-langcjk). A couple of collections in scheme-medium have such dependencies, but in fact they were also in the main dependencies. Looking at the file, it seems two passes should always be enough.
- A collection depends on packages. Every package is in exactly one collection, but what I had not expected was that a package can depend on other packages. Currently stalled here, there might be too many items to review.
- Apart from this, a package may contain texmf-dist/ items OR RELOC/ items which appear to be for texmf-dist. It can also depend on programs for an ARCH (x86_64-linux in our case).
There are certain limitations from cutting down after doing a full install, in particular:
(i.) the full install includes all hyphenation, other schemes have less of this.
(ii.) the shipped updmap.cfg contains all fonts, instead of only those that are present. My cut-down version similarly does this, but since the formats have already been created it does no harm.
(iii.) tlmgr cannot be used, obviously.
(iv.) Only copying the specified programs from the full build into the work area might result in broken symlinks. Care will be needed. Also, removing compiled programs risks having to reinstall all of texlive if too much is accidentally removed.
I'm going to look at the first few packages which were reported multiple times in my files, to see if they all come from the same collection (or from TLCore).
comment:4 by , 21 months ago
Continuing, with the assumption that packages depend on packages that are either in the same collection, or are in TLCore. I think the main reason for the depend items is to enable people to add packages and automatically pull in missing dependencies.
comment:5 by , 21 months ago
Got to the point where I can list the texmf-dist files and the programs my script reports as part of scheme-medium, and compare those to what is in the binary for scheme-medium from when TL2023 was released.
I have more programs (329) than are in the binary (125) so something is seriously wrong in that part.
For the files, I have occasional missing files, unwanted doc/generic/elhyphen (not in the binary) and missing all of doc/generic/pgf. I stopped comparing the list of files at that point. My attempt to parse the tlpdb to reduce the full source install towards an arbitrary scheme install is going nowhere and is now abandoned.
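A comparison like the one described in this comment can be summarised per directory instead of file by file, so that a whole missing tree such as doc/generic/pgf shows up as a single entry. The following is a minimal sketch, assuming both lists hold relative paths such as texmf-dist/doc/generic/...; the function name `compare_lists` and the `depth` parameter are hypothetical.

```python
from collections import Counter


def compare_lists(mine, reference, depth=4):
    """Return (extra, missing): per-directory counts of paths present only
    in `mine` and only in `reference`, grouped by the first `depth`
    path components."""
    mine, reference = set(mine), set(reference)

    def summarise(paths):
        return Counter("/".join(p.split("/")[:depth]) for p in paths)

    return summarise(mine - reference), summarise(reference - mine)
```

Printing `extra.most_common()` and `missing.most_common()` would list the directories with the largest discrepancies first.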
follow-up: 7 comment:6 by , 21 months ago
I generally do a binary install that is relatively small because I really only wanted things like tex, latex, dvips, etc. Here are some stats:
$ du -sh /mnt/texlive/
1.3G /mnt/texlive/
$ find /mnt/texlive/ |wc -l
36234
$ ls -l /mnt/texlive/2022/bin/x86_64-linux/|wc -l
249
Of that last number, 164 were symlinks to places like ../../texmf-dist/scripts. That directory has a size of 110 MB.
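For anyone wanting to reproduce this kind of count on their own install, here is a minimal sketch. It assumes a bin directory laid out like the one above; the function name `symlink_stats` is hypothetical. Note that `ls -l | wc -l` also counts the leading "total" line, so its result is one higher than the number of entries.

```python
import os


def symlink_stats(bindir):
    """Return (entries, symlinks, symlinks_into_scripts) for one directory."""
    total = links = into_scripts = 0
    for entry in os.scandir(bindir):
        total += 1
        if entry.is_symlink():
            links += 1
            # the link text is inspected, so the target need not exist
            if "texmf-dist/scripts" in os.readlink(entry.path):
                into_scripts += 1
    return total, links, into_scripts
```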
comment:7 by , 21 months ago
Replying to Bruce Dubbs:
> I generally do a binary install that is relatively small because I really only wanted things like tex, latex, dvips, etc. Here are some stats:
>
> $ du -sh /mnt/texlive/
> 1.3G /mnt/texlive/
> $ find /mnt/texlive/ |wc -l
> 36234
> $ ls -l /mnt/texlive/2022/bin/x86_64-linux/|wc -l
> 249
>
> Of that last number, 164 were symlinks to places like ../../texmf-dist/scripts. That directory has a size of 110 MB.
I was aware of why you use that approach. I'd hoped to be able to offer something similar by removing things from the full source build, but doing that is currently beyond me: I mostly script in bash, which is far too slow to read all of texlive.tlpdb, and my rusty perl has led to some fatal bugs, omitting some files and adding others.
My use-case is documenting what fonts can do (if I ever get back to that) and trying to maintain our source builds: for that I have gradually increased my tex documents. There is a lot I don't need, other things I will only look at if I'm online and stumble across explanations of why|how to use them.
A hint is forthcoming, I've just uploaded my detailed thoughts on what *I* need and what I can do without, plus comments on testing and the diminishing returns from looking at removing certain items, https://www.linuxfromscratch.org/~ken/TL2023/reduced-2023-texmf.txt
comment:8 by , 20 months ago
Milestone: | x-future → 99-Waiting |
---|---|
Updated and spell-checked version of hint has been submitted, keeping this open until hint is accepted.
comment:9 by , 20 months ago
Milestone: | 99-Waiting → 12.0 |
---|---|
Resolution: | → fixed |
Status: | assigned → closed |
I've spent a little time looking at this, and in particular at installing only plain tex. Along the way I've learned the following:
Therefore my current view is that I should review all the texlive --disable configure switches to determine what they remove, eventually document them, and then work out what people might wish to remove after the full install before removing broken symlinks (find (directory) -xtype l | xargs rm -v) and then rerunning mktexlsr. (on this machine '-type l' renders as if it is a vertical bar, it is actually a lowercase L)
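The broken-symlink step can be previewed before anything is deleted. The following is a minimal sketch of roughly what `find (directory) -xtype l` reports, with a dry-run default; the function names `broken_symlinks` and `remove_broken` are hypothetical, and mktexlsr would still need to be rerun afterwards as described above.

```python
import os


def broken_symlinks(top):
    """Yield symlinks under `top` whose target does not resolve."""
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # islink() looks at the link itself, exists() follows it
            if os.path.islink(path) and not os.path.exists(path):
                yield path


def remove_broken(top, dry_run=True):
    """Return the broken symlinks under `top`, deleting them unless dry_run."""
    found = sorted(broken_symlinks(top))   # collect first, then modify the tree
    if not dry_run:
        for path in found:
            os.unlink(path)
    return found
```

Calling `remove_broken(directory)` only lists candidates; passing `dry_run=False` performs the removal.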
At the moment I'm thinking that some things should always be disabled in the BLFS configure (e.g. legacy parts of omega, aleph) and adding some others as optional configure switches. And then documenting post-install removal of unwanted things. But that might upset our users (you never know who is using BLFS as long as everything is fine), so I'll post on blfs-dev, and (depending on results on blfs-dev) on blfs-support.
Obsessive, me? Yes, I think I am. Although my current (25GB+) systems usually have space for installing texlive from source, I begrudge all the items I don't really need, and they add pressure to my backups.