Fast Trigonometry Calculations

Your task is to create a program which can calculate the sine, cosine and tangent of an angle in degrees.

Rules

No built-in trigonometry functions (not even secant, cosecant and cotangent if your language has them).
You may use lookup tables, but their total size must not exceed 3000 members (for all of the three operations put together). Please make it read the tables from a file (e.g. trig.lookup) so they don't confuse the code.
No network access.
You must correctly round your output as explained below. Don't use floor or ceiling.
You may use any method to calculate the values, for example continued fractions, as long as it is correct to 7 significant figures.
Your code must be able to time itself. Exclude the file I/O operations from your time - so just time the function(s) that do(es) the trig and any rounding.
I must be able to run your code. Please post a link to a freely available compiler/interpreter and give the instructions needed to compile/run the code (e.g. what options to pass to GCC).
Standard loopholes apply.

Input Format

Read from a file called trig.in unless your language doesn't support file I/O.
Angles are between 0 and 360 inclusive.
Input will consist of angles to ten significant figures in decimal digits, separated by new lines. For example:

90.00000000
74.54390000
175.5000000

Output Format

For each angle supplied, you must output its sine, cosine and tangent to 7 significant figures, separated by spaces, on a single line. Use "scientific notation", e.g. 1.745329E-5 for tan 0.001 or 1.000000E+0 for sin 90.
Denote infinity or NaN by n, for example the output for 90.00000000 should be 1.000000 0.000000 n.
If the input is three angles separated by newlines, your output should consist of three lines, each containing the sine, cosine and tangent.
You may not output anything else.
Output to a file called trig.out unless your language does not support file I/O.

Scoring

fastest-code. The challenge is to write a program which computes these three values as quickly as possible. Fastest time wins.
Everyone will receive the same test input of many angles.
Times will be recorded on my machine.
Your score is the average of three runs on the same input (you can't save anything in between runs obviously).
Compiling time not included. This challenge is more about the method used than the language. (If someone could point me to how I would exclude compilation time for languages such as Java, I'd be very grateful)
My machine is an Ubuntu 14.04 install. The processor's stats are on Pastebin (obtained by running cat /proc/cpuinfo).
I will edit your time into your answer when I've tested it.

user16402

Posted 2014-06-04T16:43:53.507

Reputation:

If you didn't have to significant figures precision requirements, I would use Bhaskara’s approximation for the Sine http://scholarworks.umt.edu/cgi/viewcontent.cgi?article=1313&context=tme — an incredible feat for the 7th century! The maximum error is −0.001631765 (downwards) and 0.0013436967 (upwards).

– sergiol – 2017-03-07T00:08:26.800

I can't use scientific notation and have output of 1.000000 0.000000 n at the same time. Also why are you so picky about how NaN is output? – gggg – 2018-03-07T16:57:33.373

1Are we supposed to time only the trig calculations, or include the io in the timing? – gggg – 2018-03-07T20:03:45.060

Does the output have to be on a single line? It looks so pretty when it's formated with an enter key... Also, is there a specific date at which a winner is picked? – Ephraim – 2014-06-09T08:35:41.867

@Ephraim what do you mean by formatted with an enter key? no, there isn't a specific date. I really need to test all these solutions, but I haven't made the test input yet ;( – None – 2014-06-09T13:04:36.673

@professorfish - see the output in my answer. Every sin, cos and tan is on a new line. Do I need to change it to output the answers onto a single line? – Ephraim – 2014-06-09T15:59:41.097

2@Ephraim The output format really doesn't matter (this isn't code-golf) as long as it outputs the sin cos and tan for every angle and they are separate – None – 2014-06-09T16:02:50.067

Does the output have to be rounded to 7 significant figures? or can it just be correct to that level? – Οurous – 2014-06-12T09:15:02.047

@Ourous it has to be rounded – None – 2014-06-12T10:32:04.970

Answers

Fortran 90

I employ the CORDIC method with a pre-tabulated array of 60 arctan values (see Wiki article for details on why that's necessary).

This code requires a file, trig.in, with all the values on newlines to be stored in the same folder as the Fortran executable. Compiling this is,

gfortran -O3 -o file file.f90

where file is whatever filename you give it (probably SinCosTan.f90 would be easiest, though it's not necessary to match program name and file name). If you have the Intel compiler, I'd recommend using

ifort -O3 -xHost -o file file.f90

as the -xHost (which doesn't exist for gfortran) provides higher level optimizations available to your processor.

My test runs were giving me about 10 microseconds per calculation when testing 1000 random angles using gfortran 4.4 (4.7 or 4.8 is available in the Ubuntu repos) and around 9.5 microseconds using ifort 12.1. Testing only 10 random angles will result in an indeterminable time using Fortran routines, as the timing routine is accurate to the millisecond and simple math says it should take 0.100 milliseconds to run all 10 numbers.

EDIT Apparently I was timing IO which (a) made the timing longer than necessary and (b) is contrary to bullet #6. I've updated the code to reflect this. I've also discovered that using a kind=8 integer with the intrinsic subroutine system_clock gives microsecond accuracy.

With this updated code, I am now computing each set of values of the trigonometric functions in about 0.3 microseconds (the significant digits in the end vary run-to-run, but it's consistently hovering near 0.31 us), a significant reduction from the previous iteration that timed IO.

program SinCosTan
   implicit none
   integer, parameter :: real64 = selected_real_kind(15,307)
   real(real64), parameter :: PI  = 3.1415926535897932384626433832795028842
   real(real64), parameter :: TAU = 6.2831853071795864769252867665590057684
   real(real64), parameter :: half = 0.500000000000000000000_real64
   real(real64), allocatable :: trigs(:,:), angles(:)
   real(real64) :: time(2), times, b
   character(len=12) :: tout
   integer :: i,j,ierr,amax
   integer(kind=8) :: cnt(2)

   open(unit=10,file='trig.out',status='replace')
   open(unit=12,file='CodeGolf/trig.in',status='old')
! check to see how many angles there are
   i=0
   do
      read(12,*,iostat=ierr) b
      if(ierr/=0) exit
      i=i+1
   enddo !- 
   print '(a,i0,a)',"There are ",i," angles"
   amax = i

! allocate array
   allocate(trigs(3,amax),angles(amax))

! rewind the file then read the angles into the array
   rewind(12)
   do i=1,amax
      read(12,*) angles(i)
   enddo !- i

! compute trig functions & time it
   times = 0.0_real64
   call system_clock(cnt(1)) ! <-- system_clock with an 8-bit INT can time to us
   do i=1,amax
      call CORDIC(angles(i), trigs(:,i), 40)
   enddo !- i
   call system_clock(cnt(2))
   times = times + (cnt(2) - cnt(1))

! write the angles to the file
   do i=1,amax
      do j=1,3
         if(trigs(j,i) > 1d100) then
            write(tout,'(a1)') 'n'
         elseif(abs(trigs(j,i)) > 1.0) then
            write(tout,'(f10.6)') trigs(j,i)
         elseif(abs(trigs(j,i)) < 0.1) then
            write(tout,'(f10.8)') trigs(j,i)
         else
            write(tout,'(f9.7)') trigs(j,i)
         endif
         write(10,'(a)',advance='no') tout
      enddo !- j
      write(10,*)" "
   enddo !- i

   print *,"computation took",times/real(i,real64),"us per angle"
   close(10); close(12)
 contains
   !> @brief compute sine/cosine/tangent
   subroutine CORDIC(a,t,n)
     real(real64), intent(in) :: a
     real(real64), intent(inout) :: t(3)
     integer, intent(in) :: n
! local variables
     real(real64), parameter :: deg2rad = 1.745329252e-2
     real(real64), parameter :: angles(60) = &
       [ 7.8539816339744830962e-01_real64, 4.6364760900080611621e-01_real64, &
         2.4497866312686415417e-01_real64, 1.2435499454676143503e-01_real64, &
         6.2418809995957348474e-02_real64, 3.1239833430268276254e-02_real64, &
         1.5623728620476830803e-02_real64, 7.8123410601011112965e-03_real64, &
         3.9062301319669718276e-03_real64, 1.9531225164788186851e-03_real64, &
         9.7656218955931943040e-04_real64, 4.8828121119489827547e-04_real64, &
         2.4414062014936176402e-04_real64, 1.2207031189367020424e-04_real64, &
         6.1035156174208775022e-05_real64, 3.0517578115526096862e-05_real64, &
         1.5258789061315762107e-05_real64, 7.6293945311019702634e-06_real64, &
         3.8146972656064962829e-06_real64, 1.9073486328101870354e-06_real64, &
         9.5367431640596087942e-07_real64, 4.7683715820308885993e-07_real64, &
         2.3841857910155798249e-07_real64, 1.1920928955078068531e-07_real64, &
         5.9604644775390554414e-08_real64, 2.9802322387695303677e-08_real64, &
         1.4901161193847655147e-08_real64, 7.4505805969238279871e-09_real64, &
         3.7252902984619140453e-09_real64, 1.8626451492309570291e-09_real64, &
         9.3132257461547851536e-10_real64, 4.6566128730773925778e-10_real64, &
         2.3283064365386962890e-10_real64, 1.1641532182693481445e-10_real64, &
         5.8207660913467407226e-11_real64, 2.9103830456733703613e-11_real64, &
         1.4551915228366851807e-11_real64, 7.2759576141834259033e-12_real64, &
         3.6379788070917129517e-12_real64, 1.8189894035458564758e-12_real64, &
         9.0949470177292823792e-13_real64, 4.5474735088646411896e-13_real64, &
         2.2737367544323205948e-13_real64, 1.1368683772161602974e-13_real64, &
         5.6843418860808014870e-14_real64, 2.8421709430404007435e-14_real64, &
         1.4210854715202003717e-14_real64, 7.1054273576010018587e-15_real64, &
         3.5527136788005009294e-15_real64, 1.7763568394002504647e-15_real64, &
         8.8817841970012523234e-16_real64, 4.4408920985006261617e-16_real64, &
         2.2204460492503130808e-16_real64, 1.1102230246251565404e-16_real64, &
         5.5511151231257827021e-17_real64, 2.7755575615628913511e-17_real64, &
         1.3877787807814456755e-17_real64, 6.9388939039072283776e-18_real64, &
         3.4694469519536141888e-18_real64, 1.7347234759768070944e-18_real64]
     real(real64), parameter :: kvalues(33) = &
       [ 0.70710678118654752440e+00_real64, 0.63245553203367586640e+00_real64, &
         0.61357199107789634961e+00_real64, 0.60883391251775242102e+00_real64, &
         0.60764825625616820093e+00_real64, 0.60735177014129595905e+00_real64, &
         0.60727764409352599905e+00_real64, 0.60725911229889273006e+00_real64, &
         0.60725447933256232972e+00_real64, 0.60725332108987516334e+00_real64, &
         0.60725303152913433540e+00_real64, 0.60725295913894481363e+00_real64, &
         0.60725294104139716351e+00_real64, 0.60725293651701023413e+00_real64, &
         0.60725293538591350073e+00_real64, 0.60725293510313931731e+00_real64, &
         0.60725293503244577146e+00_real64, 0.60725293501477238499e+00_real64, &
         0.60725293501035403837e+00_real64, 0.60725293500924945172e+00_real64, &
         0.60725293500897330506e+00_real64, 0.60725293500890426839e+00_real64, &
         0.60725293500888700922e+00_real64, 0.60725293500888269443e+00_real64, &
         0.60725293500888161574e+00_real64, 0.60725293500888134606e+00_real64, &
         0.60725293500888127864e+00_real64, 0.60725293500888126179e+00_real64, &
         0.60725293500888125757e+00_real64, 0.60725293500888125652e+00_real64, &
         0.60725293500888125626e+00_real64, 0.60725293500888125619e+00_real64, &
         0.60725293500888125617e+00_real64 ]
    real(real64) :: beta, c, c2, factor, poweroftwo, s
    real(real64) :: s2, sigma, sign_factor, theta, angle
    integer :: j

! scale to radians
    beta = a*deg2rad
! ensure angle is shifted to appropriate range
    call angleShift(beta, -PI, theta)
! check for signs
    if( theta < -half*PI) then
       theta = theta + PI
       sign_factor = -1.0_real64
    else if( half*PI < theta) then
       theta = theta - PI
       sign_factor = -1.0_real64
    else
       sign_factor = +1.0_real64
    endif

! set up some initializations...    
    c = 1.0_real64
    s = 0.0_real64
    poweroftwo = 1.0_real64
    angle = angles(1)

! run for 30 iterations (should be good enough, need testing)
    do j=1,n
       sigma = merge(-1.0_real64, +1.0_real64, theta <  0.0_real64)
       factor = sigma*poweroftwo

       c2 = c - factor*s
       s2 = factor*c + s
       c = c2
       s = s2
! update remaining angle
       theta = theta - sigma*angle

       poweroftwo = poweroftwo*0.5_real64
       if(j+1 > 60) then
          angle = angle * 0.5_real64
       else
          angle = angles(j+1)
       endif
    enddo !- j

    if(n > 0) then
       c = c*Kvalues(min(n,33))
       s = s*Kvalues(min(n,33))
    endif
    c = c*sign_factor
    s = s*sign_factor

    t = [s, c, s/c]
   end subroutine CORDIC

   subroutine angleShift(alpha, beta, gamma)
     real(real64), intent(in) :: alpha, beta
     real(real64), intent(out) :: gamma
     if(alpha < beta) then
        gamma = beta - mod(beta - alpha, TAU) + TAU
     else
        gamma = beta + mod(alpha - beta, TAU) 
     endif
   end subroutine angleShift

end program SinCosTan

Kyle Kanos

Posted 2014-06-04T16:43:53.507

Reputation: 4 270

@KyleKanos How does the timing compare with the native Fortran functions if you do this for say, 10^6 random angles? You said it was indeterminate, but I'm curious, and maybe doing a lot more iterations would give better results. – jvriesem – 2015-09-23T21:42:05.110

2Finally, someone used CORDIC :D – qwr – 2014-06-09T21:26:47.810

1I think that "-march=native" is the gfortran flag corresponding to ifort "-xHost". Also, I believe Intel has -O3 set to a more aggressive mode than gfortran, so you could try gfortran with "-O3 -fno-protect-parens -fstack-arrays" to see if it helps. – semi-extrinsic – 2014-06-12T08:07:23.010

Also, you are timing the IO part as well, since you do read inside the loop. The rules specifically say you should not time IO. Fixing this gave quite the speedup on my computer: 0.37 microseconds per value, vs. 6.94 for your posted code. Also, the posted code does not compile, there's a trailing comma on line 100. There is also an error on line 23: trigs(i) should be just trigs. This makes the posted code segfault. – semi-extrinsic – 2014-06-12T08:50:53.383

Improved version here: http://pastebin.com/freiHTfx

– semi-extrinsic – 2014-06-12T09:00:03.987

Update re: compiler options: -march and -fno-protect-parens did nothing, but -fstack-arrays shaved off another 0.1 microseconds per value. "ifort -O3 -xHost" is, remarkably, almost 2x slower than "gfortran -O3 -fstack-arrays": 0.55 vs. 0.27 – semi-extrinsic – 2014-06-12T09:03:25.773

@semi-extrinsic: Thanks for the catches. I was formatting some things between my copy & the one posted here and must have missed some things. – Kyle Kanos – 2014-06-12T13:24:40.803

Strangely, my gfortran does not accept -fstack-arrays as an acceptable compiler flag (I have 4.4 natively, but do have a 4.8 installed now; it works on neither). – Kyle Kanos – 2014-06-12T13:37:32.787

That's strange. I have 4.8.2, and compiling with "gfortran -O3 -march=native -fstack-arrays file.f90" works perfectly. – semi-extrinsic – 2014-06-15T20:17:58.093

Python 2.7.x or Java (Take your pick)

A free Python interpreter can be downloaded from here.
A free Java interpreter can be downloaded from here.

The program can take input both from a file named trig.in located in the same directory as the program file. Input is separated by newlines.

I originally did this in python because - well, I love python. But, since I wanna try to win as well, I rewrote it in java afterwards...

Python Version: I got about 21µs per run on my computer. I got about 32µs when running it on IDEone.

Java version: I get about 0.4µs per run on my computer, and 1.8µs on IDEone.

Computer specs:

Windows 8.1 update 1 64-bit with intel core i7-3632QM - 2.2GHz)

Test:

Time per run" is the cumulative time it takes to calculate the sin, cos, and tan all of the input angles.

The test input used for both is as follows:

90.00000000  
74.54390000  
175.5000000  
3600000.000

About The Code:
The premise for this program was to estimate sin and cos using their Taylor polynomials with 14 terms, which was what I calculated was necessary to have an error estimate of less than 1e-8. However I found it was faster to calculate sin than cos, so decided to instead calculate cos by using cos=sqrt(1-sin^2)

Python Version:

import math
import timeit
import sys
import os
from functools import partial

#Global Variabls
pi2 = 6.28318530718
piover2 = 1.57079632679

#Global Lists
factorialList = [1.0,
                 -6.0,
                 120.0,
                 -5040.0,
                 362880.0,
                 -39916800.0,
                 6227020800.0,
                 -1307674368000.0,
                 355687428096000.0,
                 -121645100408832000.0,
                 51090942171709440000.0,
                 -25852016738884976640000.0,
                 15511210043330985984000000.0,
                 -10888869450418352160768000000.0,
                 8841761993739701954543616000000.0]

#simplifies angles and converts them to radians
def torad(x):  
    rev = float(x)/360.0
    if (rev>1) or (rev<0):
        return (rev - math.floor(rev))*pi2
    return rev*pi2

def sinyield(x):
    squared = x*x
    for n in factorialList:
        yield x/n
        x*=squared

def tanfastdivide(sin, cos):
    if (cos == 0):
        return "infinity"  
    return sin/cos

#start calculating sin cos and tan
def calcyield(outList):
    for angle in outList[0]:
        angle = torad(angle)
        sin = round(math.fsum(sinyield(angle)), 7)
        cos=math.copysign(math.sqrt(1-(sin*sin)),(piover2-angle))
        yield sin
        yield cos
        yield tanfastdivide(sin, cos) #tan

def calculations(IOList):
    calcyieldgen = calcyield(IOList)
    for angle in IOList[0]:
        IOList[1].append(next(calcyieldgen))
        IOList[2].append(next(calcyieldgen))
        IOList[3].append(next(calcyieldgen))
    return IOList

#Begin input from file
def ImportFile():
    try:
        infile = open("trig.in", "r")
    except:
        infile = sys.stdin
    inList = [[], [], [], []]

    #take input from file
    for line in infile:
        inList[0].extend([float(line)])
    return inList

#Begin output to file
def OutputResults(outList):
    try:
        outfile = open("trig.out", "w")
        PrintOutput(outList, outfile)    
    except:
        print 'Failed to write to file. Printing to stdout instead...'
    finally:
        PrintOutput(outList, sys.stdout)

def PrintOutput(outList, outfile):
    #outList[0][i] holds original angles
    #outList[1][i] holds sin values
    #outList[2][i] holds cos values
    #outList[3][i] holds tan values
    outfile.write('-----------------------------------------------------\n')
    outfile.write('                    TRIG RESULTS                     \n')
    outfile.write('-----------------------------------------------------\n')
    for i in range(len(outList[0])):
        if (i):
            outfile.write('\n')
        outfile.write("For angle: ")
        outfile.write(str(outList[0][i]))
        outfile.write('\n    ')
        outfile.write("Sin: ")
        outfile.write(str('%.7E' % float(outList[1][i])))
        outfile.write('\n    ')
        outfile.write("Cos: ")
        outfile.write(str('%.7E' % float(outList[2][i])))
        outfile.write('\n    ')
        outfile.write("Tan: ")
        outfile.write(str('%.7E' % float(outList[3][i])))


#Run the Program first
inList = ImportFile()
OutputResults(calculations(inList))

#Begin Runtime estimation
def timeTest(IOList):
    for output in calcyield(IOList):
        pass
def baselined(inList):
    for x in inList:
        pass

totime = timeit.Timer(partial(timeTest, inList))
baseline = timeit.Timer(partial(baselined, inList))
print '\n-----------------------------------------------------'
print '                    TIME RESULTS:                    '
print '-----------------------------------------------------'
OverheadwithCalcTime =  min(totime.repeat(repeat=10, number=10000))
Overhead = min(baseline.repeat(repeat=1, number=10000))
estimatedCalcTime = (OverheadwithCalcTime - Overhead)
estimatedTimePerAngle = estimatedCalcTime/len(inList)
estimatedTimePerCalc = estimatedTimePerAngle/3
print ' Estimated CalcTime+Overhead:.....', '%.10f' % (OverheadwithCalcTime*100), 'µsec'
print ' Estimated Overhead Time:..........', '%.10f' % (Overhead*100), 'µsec'
print ''
print ' Estimated CalcTime/Run:..........', '%.10f' % (estimatedCalcTime*100), 'µsec'
print ' Estimated CalcTime/Angle:.........', '%.10f' % (estimatedTimePerAngle*100), 'µsec'
print ' Estimated CalcTime/Cacluation:....', '%.10f' % (estimatedTimePerCalc*100), 'µsec'
print '-----------------------------------------------------'
print "                   COOL, IT WORKED!                  "
print '-----------------------------------------------------'

Java Version:

import java.io.FileNotFoundException;
import java.io.File;
import java.io.PrintStream;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.util.ArrayList;
import java.util.Scanner;
import java.lang.Math;

class Trig {
   /**
    *Global Variables
    **/
    static final double pi2 = 6.28318530718;
    public long totalTime = 0L;
    static final double piover2 = 1.57079632679;
    static final double plusinfty = Double.POSITIVE_INFINITY;
    static final double minusinfty = Double.NEGATIVE_INFINITY;
    static final double factoriallist[] =
                             new double[]{
                         -6.0,120.0,-5040.0,362880.0,-39916800.0,
                         6227020800.0,-1307674368000.0,355687428096000.0,
                        -121645100408832000.0,51090942171709440000.0,
                        -25852016738884976640000.0,
                         15511210043330985984000000.0,
                        -10888869450418352160768000000.0,
                         8841761993739701954543616000000.0
                         };
//Begin Program
    public static void main(String[] args) {
        Trig mytrig = new Trig();
        double[] input = mytrig.getInput();
        double[][] output = mytrig.calculate(input);
        mytrig.OutputResults(output);
        Trig testIt = new Trig();
        testIt.timeIt(input);
    }

//Begin Timing
    public void timeIt(double[] input) {
        double overhead = 0L;
        totalTime = 0L;

        for (int i = 0; i < 1000001; i++) {
            calculate(input);
            if (i == 0) {
                overhead = totalTime;
                totalTime = 0L;
            }
        }
        double averageTime = ((Double.valueOf(totalTime-overhead))/1000000.0);
        double perAngleTime = averageTime/input.length;
        double perOpperationTime = perAngleTime/3;
        NumberFormat formatter = new DecimalFormat("0000.0000");
        System.out.println("\n-----------------------------------------------------");
        System.out.println("                    TIME RESULTS:                    ");
        System.out.println("-----------------------------------------------------");
        System.out.println("Average Total  Runtime:.......... " + formatter.format(averageTime) + " nsec");
        System.out.println("                                = " + formatter.format(averageTime/1000) + " usec\n");
        System.out.println("Average Runtime Per Angle:....... " + formatter.format(perAngleTime) + " nsec");
        System.out.println("                                = " + formatter.format(perAngleTime/1000) + " usec\n");
        System.out.println("Average Runtime Per Opperation:.. " + formatter.format(perOpperationTime) + " nsec");
        System.out.println("                                = " + formatter.format(perOpperationTime/1000) + " usec");
    }

//Begin Input
    double[] getInput() {
        Scanner in;
        ArrayList<Double> input = new ArrayList<Double>();
        try {
            in = new Scanner(new File("trig.in"));
        } catch (FileNotFoundException e) {
            new FileNotFoundException("Failed to read from file. Reading from stdin instead...").printStackTrace();
            in= new Scanner(System.in);
        }
        while (in.hasNextLine()) {
            Double toadd = Double.parseDouble(in.nextLine());
            input.add(toadd);   
        }
        in.close();
        double[] returnable = new double[input.size()];
        for(int i = 0; i < input.size(); i++) {returnable[i] = input.get(i);}
        return returnable;
    }

//Begin OutputStream Choice
    void OutputResults(double[][] outList) {
        PrintStream out;
        try {
            out = new PrintStream("trig.out");
            PrintOutput(outList, out);
            PrintOutput(outList, System.out);
        } catch (FileNotFoundException e) {
            new FileNotFoundException("Failed to write to file. Printing to stdout instead...").printStackTrace();
            PrintOutput(outList, System.out);
        }
    }

//Begin Output
    static void PrintOutput(double[][] outList, PrintStream out) {
        /**
         *outList[0][i] holds original angles
         *outList[1][i] holds sin values
         *outList[2][i] holds cos values
         *outList[3][i] holds tan values
         */
        NumberFormat formatter = new DecimalFormat("0.0000000E0");
        out.println("-----------------------------------------------------");
        out.println("                    TRIG RESULTS                     ");
        out.println("-----------------------------------------------------");
        for (int i=0; i<outList[0].length; i++) {
            out.println("For Angle: " + outList[0][i]);

            out.println("      sin: " + formatter.format(outList[1][i]));
            out.println("      cos: " + formatter.format(outList[2][i]));
            if (Double.valueOf(outList[3][i]).isInfinite() || Double.valueOf(outList[3][i]).isNaN()) {
                out.println("      tan: " + outList[3][i]);
            }
            else out.println("      tan: " + formatter.format(outList[3][i]));
        }
        if (out != System.out) out.close();
    }

//Begin Calculations
    double[][] calculate(double[] IOList) {
        double[][] outList = new double[4][IOList.length];
        double sin;
        double cos;
        double tan;
        double rads;
        int i = 0;
        long calctime = 0L;
        long startTime;
        long endTime;
        for (double input : IOList) {
            startTime = System.nanoTime();
            rads = toRad(input);
            sin=sin(rads);
            cos = ((piover2-rads)>=0) ? Math.sqrt((1.0-(sin*sin))) : -Math.sqrt((1.0-(sin*sin)));
            tan = (cos!=0.0d) ? sin/cos : ((sin>0) ? plusinfty : minusinfty);
            endTime = System.nanoTime();
            calctime = calctime + endTime - startTime;
            outList[0][i] = input;
            outList[1][i] = sin;
            outList[2][i] = cos;
            outList[3][i] = tan;
            i++;
        }
        totalTime = totalTime + calctime;
        return outList;
    }

//Convert Degrees to Radians
    double toRad(double deg) {
        double rev = deg/360.0;
        return (rev>1 || rev<0) ? Math.abs(rev - ((int)rev))*pi2 : rev*pi2;
    }

//Get sin
    double sin(double angle) {
        double sqr = angle*angle;
        double value = angle;
        for (double fct : factoriallist) {
            value += ((angle*=sqr)/fct);
        }
        return ((long)((value + Math.copySign(0.0000000005d, value))*10000000.0))/10000000.0;
    }   
}

Ephraim

Posted 2014-06-04T16:43:53.507

Reputation: 143

Your cosines are wrong for 180<x<360, and the program fails completely on 270. – Οurous – 2014-06-12T10:17:46.557

@Ourous - I modified it, so it should work now in both languages. – Ephraim – 2014-06-12T18:11:49.813

You cos calculation is overkill, I'd just do sin(x+90degrees) – Skizz – 2014-06-13T08:17:05.483

@Skizz - In my program, I use the word sin as both a function, and a variable. I thought it would be faster not to have to pass something to thesin() a second time, but I'll compare the two to see if that is really the case. Was your impression that copySign() function is slower that adding things up such as in my sin() function? – Ephraim – 2014-06-13T14:53:57.163

Ah, I see you're doing sin and cos at the same time. My comment would only really be valid if you were doing sin or cos. – Skizz – 2014-06-13T15:01:41.897

Cobra

class Trig
    const mod as float = 0.0174532925199433f #0.0174532925199432957692369076848861271344287188854172f
    var time as System.Diagnostics.Stopwatch = System.Diagnostics.Stopwatch()
    var file as String[] = File.readAllLines('trig.in')
    var sin_out as float[] = float[](1)
    var cos_out as float[] = float[](1)
    var tan_out as float[] = float[](1)
    def main
        .compute(@[1f])
        .time.reset
        input = float[](.file.length)
        for num, line in .file.numbered, input[num] = float.parse(line)
        .compute(input)
        for num in .file.length, .file[num] = (.sin_out[num].toString('0.000000E+0') + ' ' + .cos_out[num].toString('0.000000E+0') + ' ' + .tan_out[num].toString('0.000000E+0'))
        File.writeAllLines('trig.out', .file)
        print .time.elapsed
    def compute(list as float[])
        .sin_out = float[](list.length)
        .cos_out = float[](list.length)
        .tan_out = float[](list.length)
        .time.start
        for index in list.length
            degrees as float = list[index]
            #add `degrees %= 360` for numbers greater than 360
            rad as float = sin as float = degrees * .mod
            two as float = rad * rad
            sin -= (rad *= two) / 6
            sin += (rad *= two) / 120
            sin -= (rad *= two) / 5040
            sin += (rad *= two) / 362880
            sin -= (rad *= two) / 39916800
            sin += (rad *= two) / 6227020800
            sin -= (rad *= two) / 1307674368000
            sin += (rad *= two) / 355687428096000
            sin -= (rad *= two) / 121645100408832000
            sin += (rad *= two) / 51090942171709440000f
            sin -= (rad *= two) / 25852016738884976640000f
            sin += (rad *= two) / 15511210043330985984000000f
            sin -= (rad *= two) / 10888869450418352160768000000f
            sin += (rad *= two) / 8841761993739701954543616000000f
            cos as float = (1 - (sin * sin)).sqrt * ((degrees - 180).abs - 90).sign
            if cos.isNaN, cos = 0
            .tan_out[index] = Math.round((sin / cos) * 10000000) / 10000000
            .sin_out[index] = Math.round(sin * 10000000) / 10000000
            .cos_out[index] = Math.round(cos * 10000000) / 10000000
        .time.stop

Compile it with cobra filename -turbo

Tests: AMD FX6300 @5.1GHz

The 360 * 10000 test used by the C answer runs in 365ms (vs 190ms)
The 4-entry test used by the Python and Java answers runs in 0.32µs (vs 30µs, 3µs)
The 1000 random angle test used by the Fortran answer runs at 100ns per angle (vs 10µs)

Οurous

Posted 2014-06-04T16:43:53.507

Reputation: 7 916

2So apart from giving the wrong answer and being too slow, it's fine? :) – None – 2014-06-10T06:43:17.700

@Lembik It's now fixed. – Οurous – 2014-06-10T08:42:09.930

4do you realize you basically just wrote the same program in a different snake? – Ephraim – 2014-06-10T16:39:14.783

Octave (or Matlab) & C

A bit of a complicated build process, but kind of a novel approach and the results were encouraging.

The approach is to generate approximating quadratic polynomials for each degree. So degree = [0, 1), degree = [1, 2), ..., degree = [359, 360) will each have a different polynomial.

Octave - building part

Octave is publicly available - Google download octave.

This determines the best fit quadratic polynomial for every degree.

Save as build-fast-trig.m:

format long;
for d = 0:359
    x = (d-1):0.1:(d+1);
    y = sin(x / 360 * 2 * pi);
    polyfit(x, y, 2)
endfor

C - building part

This converts doubles in text format to native binary format on your system.

Save as build-fast-trig.c:

#include <stdio.h>

int main()
{
    double d[3];

    while (scanf("%lf %lf %lf", d, d + 1, d + 2) == 3)
        fwrite(d, sizeof(double), 3, stdout);

    return 0;
}

Compile:

gcc -o build-fast-trig build-fast-trig.c

Generating the coefficients file

Run:

octave build-fast-trig.m | grep '^ ' | ./build-fast-trig > qcoeffs.dat

Now we have qcoeffs.dat as the data file to use for the actual program.

C - fast-trig part

Save as fast-trig.c:

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <time.h>

#define INPUT    "qcoeffs.dat"

#define DEGREES    360

typedef struct {double a, b, c;} QCOEFFS;

double normalize(double d)
{
    if (d < 0.0)
        d += ceil(d / -(double)DEGREES) * (double)DEGREES;

    if (d >= (double)DEGREES)
        d -= floor(d / (double)DEGREES) * (double)DEGREES;

    return d;
}

int main()
{
    FILE *f;
    time_t tm;
    double d;
    QCOEFFS qc[DEGREES];

    if (!(f = fopen(INPUT, "rb")) || fread(qc, sizeof(QCOEFFS), DEGREES, f) < DEGREES)
    {
        fprintf(stderr, "Problem with %s - aborting.", INPUT);
        return EXIT_FAILURE;
    }
    fclose(f);

    tm = -clock();

    while (scanf("%lf", &d) > 0)
    {
        int i;
        double e, f;

        /* sin */
        d = normalize(d);
        i = (int)d;
        e = (qc[i].a * d + qc[i].b) * d + qc[i].c;

        /* cos */
        d = normalize((double)DEGREES / 4.0 - d);
        i = (int)d;
        f = (qc[i].a * d + qc[i].b) * d + qc[i].c;

        /* tan = sin / cos */

        /* output - format closest to specs, AFAICT */
        if (d != 0.0 && d != 180.0)
            printf("%.6e %.6e %.6e\n", e, f, e / f);
        else
            printf("%.6e %.6e n\n", e, f);
    }

    tm += clock();

    fprintf(stderr, "time: %.3fs\n", (double)tm/(double)CLOCKS_PER_SEC);    

    return EXIT_SUCCESS;
}

Compile:

gcc -o fast-trig fast-trig.c -lm

Run:

./fast-trig < trig.in > trig.out

It will read from trig.in, save to trig.out and print to console the elapsed time with millisecond precision.

Depending on the testing methods used it may fail on certain input, e.g.:

$ ./fast-trig 
0
-6.194924e-19 1.000000e+00 -6.194924e-19

The correct output should be 0.000000e+00 1.000000e+00 0.000000e+00. If the results are validated using strings, the input will fail, if they are validated using absolute error, e.g. fabs(actual - result) < 1e-06, the input will pass.

The maximum absolute error for sin and cos was ≤ 3e-07. For tan, because the result isn't limited to ± 1 and you can divide a relatively large number by a relatively small number, the absolute error could be larger. From -1 ≤ tan(x) ≤ +1, the maximum absolute error was ≤ 4e-07. For tan(x) > 1 and tan(x) < -1, the maximum relative error, e.g. fabs((actual - result) / actual) was usually < 1e-06 until you get in the area of (90 ± 5) or (270 ± 5) degrees, then the error gets worse.

In testing, the average time per single input was (1.053 ± 0.007) µs, which on my machine was about 0.070 µs faster than native sin and cos, tan being defined the same way.

user15259

Posted 2014-06-04T16:43:53.507

Reputation:

C

Here's my attempt. It works like this:

Build a table of all values of sin(x) from 0 to 450 degrees. Equivalently this is all values of cos(x) from -90 to 360 degrees. With 2926 elements, there is enough space for a value every 1/6.5 degrees. The program unit is therefore 1/6.5 degrees, and there are 585 units in a quarter turn.

Convert input degrees into program units (multiply by 6.5==110.1 binary.) Find the nearest values for sin and cos from the table. then convert the remaining part of the input (dx) into radians.

apply the formula sin(x+dx) == sin x +(d(sin x)/dx)*dx. note that (d(sin x)/dx)==cos x, but only if we use radians.

unfortunately that isn't accurate enough on its own, so another term is required, based on the next derivative d2(sin x)/dx2 == -sin x. This needs to be multiplied by dx*dx/2 (not sure where the factor of 2 comes from, but it works.)

Follow the analogous procedure for cos x, then calculate tan x == sin x / cos x.

Code

There are about 17 floating point operations in here. That can be improved somewhat. The program contains table-building and test output using the native trig functions, but the algorithm does not. I will add timing and edit to comply with I/O requirements later (hopefully this weekend.) It matches the native function output except for very small values of sin x and cos x, which should be improvable to better than the native function output with some tweaking.

<#include <math.h>                                                 //only for table building and testing
int a;
double t[2926],                                                    //a table for sin x from 0 to 360+90=450deg every 1/6.5 degrees
x,                                                                 //input
s,c,                                                               //first guess sin and cos (from table)
sn,cs,                                                             //output sin and cos
pi1170=3.1415926535897932384626433832795/1170,                     // there are 1170 units of 1/6.5 degrees in a half circle 
pi180=3.1415926535897932384626433832795/180;                       // pi/180 only used for testing

main(){
  for (a=0;a<=2925;a++)t[a]=sin(a*pi1170);                         //build table. 

  scanf("%lf",&x);                                                 //input 
  printf("%le %le %le\n",sin(x*pi180),cos(x*pi180),tan(x*pi180));  //native output for testing purposes

  x*=6.5;                                                          //convert from deg to program units. 6.5=110.1 in binary, a fairly round number. 
  a=x+0.5;                                                         //a=round(x) add 0.5 to round, not truncate. Assigning to int, this is probably faster than the round function.
  x=(x-a)*pi1170;                                                  //(x-a)=dx in program units. Divide to get radians. 

  s=t[a];                                                          //get sin(a) from table
  c=t[a+585];                                                      //cos(a)=sin(a+90degrees)=sin(a+585units)
  sn=s+c*x-s*x*x/2;                                                //sin(x+dx)=sin(x)+cos(dx)-sin(dx^2/2)
  cs=c-s*x-c*x*x/2;                                                //cos(x+dx)=cos(x)-sin(dx)-cos(dx^2/2)
  printf("%le %le %le\n",sn,cs,sn/cs);                             //print sin,cos and tan=sin/cos
}

Level River St

Posted 2014-06-04T16:43:53.507

Reputation: 22 049

Fast Trig Calculation

Fast Trigonometry Calculations

Rules

Input Format

Output Format

Scoring

Answers

Fortran 90

Python 2.7.x or Java (Take your pick)

Python Version:

Java Version:

Cobra

Octave (or Matlab) & C

Octave - building part

C - building part

Generating the coefficients file

C - fast-trig part

C