Programming Parallel Computers

Aalto 2024

CP5: fast GPU solution ★★


What you will need to do in this task

Please read the general instructions for this exercise first. Here are the additional instructions specific to this task:

Using all the resources available on the GPU, solve the task as fast as possible. In this task, you are permitted to use single-precision floating-point numbers.

What I will try to do with your code

I will first run all kinds of tests to see that your code works correctly. You can try it out locally by running ./grading test, but please note that your code has to compile and work correctly not only on your own computer but also on our machines.

If all is fine, I will run the benchmarks. You can try it out on your own computer by running ./grading benchmark, but of course the precise running time on your own computer might be different from the performance on our grading hardware.

Benchmarks

Name             Parameters
benchmarks/1     nx = 1000, ny = 1000
                 the input contains 1000 × 1000 pixels, and the output should contain 1000 × 1000 pixels
benchmarks/2a    nx = 1000, ny = 4000
                 the input contains 4000 × 1000 pixels, and the output should contain 4000 × 4000 pixels
benchmarks/2b    nx = 1000, ny = 4000
                 the input contains 4000 × 1000 pixels, and the output should contain 4000 × 4000 pixels
benchmarks/2c    nx = 999, ny = 3999
                 the input contains 3999 × 999 pixels, and the output should contain 3999 × 3999 pixels
benchmarks/2d    nx = 1001, ny = 4001
                 the input contains 4001 × 1001 pixels, and the output should contain 4001 × 4001 pixels
benchmarks/3     nx = 6000, ny = 6000
                 the input contains 6000 × 6000 pixels, and the output should contain 6000 × 6000 pixels
benchmarks/4a    nx = 9000, ny = 9000
                 the input contains 9000 × 9000 pixels, and the output should contain 9000 × 9000 pixels
benchmarks/4b    nx = 12000, ny = 12000
                 the input contains 12000 × 12000 pixels, and the output should contain 12000 × 12000 pixels

Grading

In this task your submission will be graded using benchmarks/4b: the input contains 12000 × 12000 pixels, and the output should contain 12000 × 12000 pixels.
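To get a feel for what the grading benchmark demands, the following is a rough back-of-the-envelope sketch. It rests on an assumption that is not stated in this task description: that a brute-force solution performs one multiply-add for every combination of output pixel and input column, that is, ny × ny × nx multiply-adds in total. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION (not part of the task text): a brute-force solution does one
// multiply-add per (output pixel, input column) pair, i.e. ny * ny * nx.
constexpr std::uint64_t multiply_adds(std::uint64_t nx, std::uint64_t ny) {
    return ny * ny * nx;
}

// Sustained multiply-adds per second needed to finish within `seconds`,
// under the same assumption as above.
inline double required_rate(std::uint64_t nx, std::uint64_t ny, double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(multiply_adds(nx, ny)) / seconds;
}
```

Under this assumption, benchmarks/4b (nx = 12000, ny = 12000) amounts to 1.728 × 10^12 multiply-adds, so a running time of 0.400 sec would correspond to roughly 4.3 × 10^12 multiply-adds per second.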

The point thresholds are as follows. If you submit your solution no later than Sunday, 26 May 2024, at 23:59:59 (Helsinki), your score will be:

Running time    Points
≤ 8.000 sec 1
≤ 6.000 sec 2
≤ 5.000 sec 3
≤ 4.000 sec 4
≤ 3.000 sec 5
≤ 2.000 sec 6
≤ 1.500 sec 7
≤ 1.000 sec 8
≤ 0.700 sec 9
≤ 0.400 sec 10

If you submit your solution after the deadline, but before the course ends on Sunday, 02 June 2024, at 23:59:59 (Helsinki), your score will be:

Running time    Points
≤ 8.000 sec 1
≤ 4.000 sec 2
≤ 1.500 sec 3
≤ 1.000 sec 4
≤ 0.400 sec 5
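The two threshold tables above can be read as a simple lookup: find the tightest threshold that the measured running time satisfies. A minimal sketch follows; the function name `points_for` and its signature are hypothetical, and it assumes that a running time above 8.000 sec earns 0 points, which the tables imply but do not state.

```cpp
#include <cassert>

// Points for a measured running time on benchmarks/4b.
// `late` selects the table for submissions made after the first deadline.
// Thresholds are copied from the tables above; the name is hypothetical.
inline int points_for(double seconds, bool late) {
    assert(seconds >= 0.0);
    struct Row { double limit; int points; };
    // Rows are ordered from the tightest threshold to the loosest.
    static const Row on_time[] = {
        {0.4, 10}, {0.7, 9}, {1.0, 8}, {1.5, 7}, {2.0, 6},
        {3.0, 5},  {4.0, 4}, {5.0, 3}, {6.0, 2}, {8.0, 1}};
    static const Row after_deadline[] = {
        {0.4, 5}, {1.0, 4}, {1.5, 3}, {4.0, 2}, {8.0, 1}};
    const Row* begin = late ? after_deadline : on_time;
    const Row* end = begin + (late ? 5 : 10);
    for (const Row* row = begin; row != end; ++row) {
        if (seconds <= row->limit) return row->points;
    }
    return 0;  // ASSUMPTION: slower than 8.000 sec earns no points
}
```

For example, `points_for(1.2, false)` falls under the ≤ 1.500 sec row of the first table, while `points_for(1.2, true)` falls under the ≤ 1.500 sec row of the second.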

Contest

Your submissions to this task will also automatically take part in the contest, and you can receive up to 2 additional points if your code is among the fastest solutions this year!

Running time    Extra points
≤ 1.20 × fastest 1
≤ 1.05 × fastest 2
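The contest thresholds are relative to the fastest solution rather than absolute. A minimal sketch of that rule follows; the function name is hypothetical, and it assumes that both running times are measured on the same benchmark and hardware, which this page does not spell out.

```cpp
#include <cassert>

// Extra contest points for a running time, relative to the fastest
// solution of the year. ASSUMPTION: both times come from the same
// benchmark on the same grading hardware. The name is hypothetical.
inline int contest_extra_points(double seconds, double fastest_seconds) {
    assert(seconds > 0.0 && fastest_seconds > 0.0);
    if (seconds <= 1.05 * fastest_seconds) return 2;
    if (seconds <= 1.20 * fastest_seconds) return 1;
    return 0;
}
```

For instance, if the fastest solution runs in 0.30 sec, a solution at 0.33 sec is within 1.20 × fastest but not within 1.05 × fastest, so it would receive 1 extra point.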