Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce images size #1636

Merged
merged 37 commits into from
Jun 26, 2020
Merged

Reduce images size #1636

merged 37 commits into from
Jun 26, 2020

Conversation

Aschen
Copy link
Contributor

@Aschen Aschen commented May 24, 2020

Reduce image size

⚠️ Depends on #1635

Reduce Kuzzle images size:

  • kuzzleio/kuzzle:2: from 563 Mo to 165 Mo
  • kuzzleio/plugin-dev:2: from 968 Mo to 410 Mo

This PR also includes 2 new tags:

  • kuzzleio/kuzzle:2-alpine: 140 Mo
  • kuzzleio/kuzzle:2-scratch: 94 Mo

I followed the tricks explained here: https://medium.com/trendyol-tech/how-we-reduce-node-docker-image-size-in-3-steps-ff2762b51d5a

Also I made few other changes for the production image:

  • remove all object and build files (.o, .h)
  • add only relevant files (lib, config, plugins, etc)
  • remove .git
  • remove useless npm dependencies
  • minify JS
  • strip re2 and node binaries

The scratch image contains only Node.js (with libstdc++), musl and Kuzzle source code.

Scripts

All docker related scripts are in docker/scripts/:

  • entrypoint.sh: the entrypoint
  • clean-node.sh: remove every unnecessary file from a directory. If NODE_ENV=production then the JS and JSON files will be minified
  • run-dev.sh: used by the core-dev image to run a core-developer stack
  • run-plugin.sh: used by plugin-dev to run an application composed of plugins (like we do now)
  • run-prod.sh: used by the kuzzle images to run kuzzle in production

How should this be manually tested?

Build the images and play the functional tests

Other changes

  • Elasticsearch initialization is now waited with the bin/wait-elasticsearch script instead of curl
  • Functionals tests are run in parallel in 2 travis jobs

@Aschen
Copy link
Contributor Author

Aschen commented May 24, 2020

I really struggled with the .dockerignore file, it does not seem to work if anyone know why

@Aschen Aschen changed the title Reduce production image size Reduce images size May 25, 2020
@Aschen Aschen marked this pull request as draft May 25, 2020 07:10
@Aschen Aschen marked this pull request as ready for review May 27, 2020 11:47
Copy link
Member

@alexandrebouthinon alexandrebouthinon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update kuzzle directory location in Docker image is a breaking change. Otherwise very nice job 👍

@@ -10,8 +10,8 @@ ARG SSH_KEY
ENV GIT_SSH_COMMAND="ssh -o StrictHostKeyChecking=no"
ENV NODE_ENV=production

WORKDIR /var/app
COPY . /var/app
WORKDIR /app
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a breaking change

}
};

run();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
run();
if (require.main === module) {
run();
}

Else, will be called on require (i.e. from kuzzle-start)

&& ln -s /app/plugins/available/kuzzle-plugin-logger /app/plugins/enabled/kuzzle-plugin-logger \
# Install plugins dependencies
&& for plugin in plugins/enabled/*; do cd "$plugin"; npm install --production; cd /app; done \
&& for plugin in plugins/available/*; do cd "$plugin"; npm install --production; cd /app; done \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
&& for plugin in plugins/available/*; do cd "$plugin"; npm install --production; cd /app; done \
&& for plugin in plugins/*/*; do npm install --production --prefix "$plugin"; done \


# Minify JS & JSON
# I wasn't able to find a portable way to do string comparison with if
( echo $NODE_ENV | grep production ) || exit 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
( echo $NODE_ENV | grep production ) || exit 0
[ "${NODE_ENV}" = "production" ] || exit 0

This is posix

Copy link
Member

@alexandrebouthinon alexandrebouthinon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used Shellcheck on scripts, there it is some of the easiest warning/error to correct 🙂

Comment on lines 94 to 98
promise_run docker_build 'kuzzle' $TRAVIS_BRANCH 'kuzzle'
promise_then docker_push 'kuzzle' $TRAVIS_BRANCH

promise_run docker_build 'kuzzle' $TRAVIS_BRANCH-alpine 'kuzzle.alpine'
promise_then docker_push 'kuzzle' $TRAVIS_BRANCH-alpine
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
promise_run docker_build 'kuzzle' $TRAVIS_BRANCH 'kuzzle'
promise_then docker_push 'kuzzle' $TRAVIS_BRANCH
promise_run docker_build 'kuzzle' $TRAVIS_BRANCH-alpine 'kuzzle.alpine'
promise_then docker_push 'kuzzle' $TRAVIS_BRANCH-alpine
promise_run docker_build 'kuzzle' "$TRAVIS_BRANCH" 'kuzzle'
promise_then docker_push 'kuzzle' "$TRAVIS_BRANCH"
promise_run docker_build 'kuzzle' "$TRAVIS_BRANCH-alpine" 'kuzzle.alpine'
promise_then docker_push 'kuzzle' "$TRAVIS_BRANCH-alpine"

Comment on lines 91 to 92
promise_run docker_build 'plugin-dev' $TRAVIS_BRANCH 'plugin-dev'
promise_then docker_push 'plugin-dev' $TRAVIS_BRANCH
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
promise_run docker_build 'plugin-dev' $TRAVIS_BRANCH 'plugin-dev'
promise_then docker_push 'plugin-dev' $TRAVIS_BRANCH
promise_run docker_build 'plugin-dev' "$TRAVIS_BRANCH" 'plugin-dev'
promise_then docker_push 'plugin-dev' "$TRAVIS_BRANCH"

promise_then docker_push 'kuzzle' $TRAVIS_BRANCH

promise_run docker_build 'kuzzle' $TRAVIS_BRANCH-alpine 'kuzzle.alpine'
promise_then docker_push 'kuzzle' $TRAVIS_BRANCH-alpine

elif [[ "$TRAVIS_BRANCH" == "master" ]] || [[ "$TRAVIS_BRANCH" == *"-stable" ]]; then
RELEASE_TAG=$(cat package.json | grep version | head -1 | awk -F: '{ print $2 }' | sed 's/[\",]//g' | tr -d '[[:space:]]')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
RELEASE_TAG=$(cat package.json | grep version | head -1 | awk -F: '{ print $2 }' | sed 's/[\",]//g' | tr -d '[[:space:]]')
RELEASE_TAG=$(grep version package.json | head -1 | awk -F: '{ print $2 }' | sed 's/[\",]//g' | tr -d '[:space:]')

@@ -0,0 +1,22 @@
#!/bin/sh
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#!/bin/sh
#!/usr/bin/env sh

@Aschen
Copy link
Contributor Author

Aschen commented Jun 2, 2020

I will wait for Kaaf before finishing this PR

@Aschen Aschen marked this pull request as draft June 2, 2020 05:00
@Aschen
Copy link
Contributor Author

Aschen commented Jun 19, 2020

@benoitvidis About the compression, the decompression is very fast (~4-5s) and it allows to have only 19 mo transiting through the network instead of 94. It could be very useful for K8S environment where the pods have to download the new image before running for example.
Also keeping a lot of images on a server takes a lot of storage. You can clean it but it's not always the case and you also may want to keep previous versions to rollback quickly.

Also the way I copy Kuzzle still allows developers to add plugins in the plugins/* directories

I will leave the kuzzleio/kuzzle, kuzzleio/plugin-dev and kuzzleio/kuzzle:*-alpine without compression and I will create a new kuzzleio/kuzzle:*-everest image with the compression for those who want extra small images, what do you think?

Image sizes:

kuzzleio/kuzzle      plugin-dev      97624376beb0    53 seconds ago  385MB
kuzzleio/kuzzle      kuzzle          be537f211ae9    37 minutes ago  165MB
kuzzleio/kuzzle      kuzzle:alpine   d826c568d69a    10 minutes ago  140MB
kuzzleio/kuzzle      kuzzle:everest  67ef01c1d47e    9 minutes ago   19.5MB

@Aschen Aschen marked this pull request as ready for review June 19, 2020 09:21
@codecov
Copy link

codecov bot commented Jun 23, 2020

Codecov Report

Merging #1636 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #1636   +/-   ##
=======================================
  Coverage   93.52%   93.52%           
=======================================
  Files         113      113           
  Lines        7197     7197           
=======================================
  Hits         6731     6731           
  Misses        466      466           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c81e218...1ea882f. Read the comment docs.

Comment on lines 3 to 4
set -e
set -x
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
set -e
set -x
set -ex

@@ -1,8 +1,7 @@
#!/bin/sh

set -e

elastic_host=${kuzzle_services__storageEngine__client__node:-http://elasticsearch:9200}
set -x
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set arguments could be combined

@@ -1,57 +1,21 @@
#!/bin/sh

set -e

elastic_host=${kuzzle_services__storageEngine__client__node:-http://elasticsearch:9200}
set -x
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same


set -e

if [ ! -z "$WITHOUT_KUZZLE" ]; then
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if [ ! -z "$WITHOUT_KUZZLE" ]; then
if [ -n "$WITHOUT_KUZZLE" ]; then

@@ -0,0 +1,34 @@
#!/bin/sh

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing set -e?

@benoitvidis
Copy link
Contributor

@Aschen , I am still not sure about everest image.

  1. docker layers are already compressed. May be worth checking but my guess is there is gain to extra compress the layer for download time.
  2. because of the mergefs system, you will get 2 layers on the disk: the one with the compressed tar + the diff once is is extracted, which is also counter productive considering the goal. The docker image ls here needs to be compared vs the real size of the corresponding layers in /var/lib/docker/overlay/merged|diff
  3. the rollback issues you are mentioning should not have much impact in most cases, as only the application layer should be updated.

These 2 points + the already discussed UX issues make me really dubious about this image.

@Aschen
Copy link
Contributor Author

Aschen commented Jun 24, 2020

@benoitvidis I replaced the image by a scratch image without the compression (size 94 Mo)

Docker compress layers with Gzip which is not as efficient as Lzma, especially for binary files:
image

The difference between compressed application layers is not that huge (13Mo gz vs 9.7 Mo lzma) so I agree that for subsequent download of the same image the gain won't be huge:
image

I still think the compression is useful in a K8S environment because the pods has to download the image whenever they are spawned so it allows to save bandwidth, which is quite expensive in most of the cloud providers.

But I understand your point of view and we need those smaller image so LGTM

@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities (and Security Hotspot 0 Security Hotspots to review)
Code Smell A 0 Code Smells

No Coverage information No Coverage information
No Duplication information No Duplication information

@Aschen Aschen merged commit c48fa76 into master Jun 26, 2020
@Aschen Aschen deleted the reduce-production-image-size branch June 26, 2020 09:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants